It's very important that the GPU build does not include any system
directories. We currently use `-ffreestanding` to disable a lot of these
features, but we can still accidentally include them if they are not
provided by `libc` yet. This patch changes this to use `-nostdinc` to
disable all standard search paths. Then we use the compiler's resource
directory to pick up the provided headers like `stdint.h`.
Differential Revision: https://reviews.llvm.org/D159445
If there are copy instructions between uaddlv and dup for transfer from gpr to
fpr, try to remove them with duplane.
Differential Revision: https://reviews.llvm.org/D159267
Using https://github.com/actions/labeler, this adds a workflow to
automatically label PRs, in the hope of reducing the work needed to triage new
PRs.
"new-prs-labeler.yml" has been seeded taking inspiration from the
CODEOWNERS file when there was an existing corresponding label on the
issue tracker.
This patch:
* Replaces `andymckay/labeler`, which does not appear to be maintained, with
the official GitHub solution
* Removes the closed issue workflow which was disabled a few years ago
and never fixed.
* Adds a few rules to add labels based on the PR title; hopefully that can
make triaging simpler. If that turns out to be useful, we can consider
adding more rules for backends, etc. We could technically also pattern
match the body of the issue but I'm concerned about trying to be _too_
clever.
The new system is only triggered on PR open so manual labels should not
be removed.
A functional-style cast (i.e. a simple-type-specifier or typename-specifier
followed by a parenthesized single expression [expr.type.conv]) is equivalent
to the C-style cast, so it makes sense for the two to have identical behavior,
including warnings.
This also matches GCC https://godbolt.org/z/b8Ma9Thjb.
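
For illustration only (this is not code from the patch), the equivalence can
be sketched as below. The alias `IntPtr` and the helper names are
hypothetical; which specific warning the patch affects is not restated here.

```cpp
#include <cassert>

// Hypothetical alias: a functional-style cast needs a simple-type-specifier,
// so the pointer type has to be given a name first.
using IntPtr = int *;

// C-style cast that drops the const qualifier.
int *viaCStyleCast(const int *P) { return (int *)P; }

// Functional-style cast with a single parenthesized expression. Per
// [expr.type.conv] it performs the same conversion as the cast above.
int *viaFunctionalCast(const int *P) { return IntPtr(P); }

// Small usage check: both spellings yield the same pointer.
void checkCastsAgree() {
  int X = 7;
  assert(viaCStyleCast(&X) == viaFunctionalCast(&X));
}
```

Since both helpers perform the same conversion, a diagnostic that fires for
one spelling is expected to fire for the other.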
Reviewed By: rnk, aaron.ballman
Differential Revision: https://reviews.llvm.org/D159133
In some cases where the same mask is used for multiple
extending masked loads it can be more efficient to combine
the zero- or sign-extend into the load even if it's not a
legal or custom operation. This leads to splitting up the
extending load into smaller parts, which also requires
splitting the mask. For SVE at least this improves the
performance of the SPEC benchmark x264 slightly on
neoverse-v1 (~0.3%), and at least one other benchmark
improves by around 30%. The uplift for SVE seems to be due to
removing the dependencies (vector unpacks) introduced
between the loads and the vector operations, since this
should increase the level of parallelism.
See tests:
CodeGen/AArch64/sve-masked-ldst-sext.ll
CodeGen/AArch64/sve-masked-ldst-zext.ll
https://reviews.llvm.org/D159191
I've added some missing tests for the following cases:
1. Zero- and sign-extends from unpacked vector types to wide,
illegal types. For example,
%aext = zext <vscale x 4 x i8> %a to <vscale x 4 x i64>
2. Normal loads combined with 1
3. Masked loads combined with 1
Differential Revision: https://reviews.llvm.org/D159192
This always succeeds. While I'm here, document why we check the size
of p0 against the value of VG.
Reviewed By: omjavaid
Differential Revision: https://reviews.llvm.org/D157845
Rename CheckBaseDerived to something more general and call it in
GetPtrField() as well, so we don't crash later in Pointer::toAPValue().
Differential Revision: https://reviews.llvm.org/D149149
This adds some commonly-used instruction aliases from various sources:
- GNU
- SPARCv9 manual
- JPS1 ASR names
Reviewed By: barannikov88
Differential Revision: https://reviews.llvm.org/D157236
This adds named ASI tag constants (such as #ASI_P and #ASI_P_L) for memory
accesses.
This patch adds 64-bit/V9 tag names, given that currently the majority of SPARC
software targets that arch.
Support for 32-bit/V8 tag names will be added in a future patch.
Reviewed By: barannikov88
Differential Revision: https://reviews.llvm.org/D157235
There are two motivations for this change:
1. It considerably simplifies adding support for the realloc operation to the
new buffer deallocation pass by lowering the realloc such that no
deallocation operation is inserted and the deallocation pass itself can
insert that dealloc
2. The lowering is expressed on a higher level and thus easier to understand,
and the lowerings of the memref operations it is composed of don't have to
be duplicated in the MemRefToLLVM lowering (also see discussion in
https://reviews.llvm.org/D133424)
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D159430
We can implement these similarly to DerivedToBase casts. We just have to
walk the class hierarchy, sum the base offsets, and subtract that sum from the
current base offset of the pointer.
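
A minimal sketch of that arithmetic, assuming a simplified single-inheritance
model. `RecordModel`, `sumBaseOffsets` and `baseToDerivedOffset` are
hypothetical names for illustration, not the interpreter's real data
structures.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of a record with at most one base class.
struct RecordModel {
  const RecordModel *Base; // direct base, or nullptr
  std::size_t BaseOffset;  // offset of the base subobject in this record
};

// Walk from Derived up to Base and sum the base offsets on the way.
std::size_t sumBaseOffsets(const RecordModel &Derived,
                           const RecordModel &Base) {
  std::size_t Sum = 0;
  for (const RecordModel *R = &Derived; R != &Base; R = R->Base) {
    assert(R->Base && "Base is not an ancestor of Derived");
    Sum += R->BaseOffset;
  }
  return Sum;
}

// A base-to-derived cast subtracts that sum from the pointer's current offset.
std::size_t baseToDerivedOffset(std::size_t CurrentOffset,
                                const RecordModel &Derived,
                                const RecordModel &Base) {
  std::size_t Sum = sumBaseOffsets(Derived, Base);
  assert(CurrentOffset >= Sum && "pointer does not point into Derived");
  return CurrentOffset - Sum;
}
```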
Differential Revision: https://reviews.llvm.org/D149133
This completes the support for the CAS instructions.
Besides the base CASA and CASXA forms, on v9 the aliases CAS, CASX, CASL, and
CASXL are also available.
Reviewed By: barannikov88
Differential Revision: https://reviews.llvm.org/D157234
This extends support for ASI-tagged loads, stores, and swaps with the new
stored-ASI form ([reg+imm] %asi) introduced in v9.
CAS instructions are handled differently by the (dis-)assembler, so they will be
handled in a separate patch.
Reviewed By: barannikov88
Differential Revision: https://reviews.llvm.org/D157233
While both SPARCv7/v8 and v9 have a register named %fq, they encode it
differently, so we need to differentiate between them.
Reviewed By: barannikov88
Differential Revision: https://reviews.llvm.org/D157232
Memref descriptors contain an `offset` field that denotes the start of
the content of the memref relative to the `alignedPtr`. This offset is
not considered when converting a memref descriptor to a np.array in the
Python runtime library, essentially treating all memrefs as if they had
an offset of zero. This patch introduces the necessary pointer arithmetic
to find the actual beginning of the memref contents to the memref->numpy
conversion functions.
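
The pointer arithmetic itself is small. As a rough sketch in C++ (not the
Python code touched by this patch), assuming a descriptor laid out like the
hypothetical `MemRefDescriptorSketch` below with the offset counted in
elements:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a ranked memref descriptor.
template <typename T, int Rank>
struct MemRefDescriptorSketch {
  T *allocated;               // pointer returned by the allocator
  T *aligned;                 // aligned pointer used for element access
  std::int64_t offset;        // start of the contents, in elements
  std::int64_t sizes[Rank];   // extent of each dimension
  std::int64_t strides[Rank]; // stride of each dimension, in elements
};

// First element of the memref contents: the aligned pointer advanced by
// the offset, rather than the aligned pointer itself.
template <typename T, int Rank>
T *dataStart(const MemRefDescriptorSketch<T, Rank> &D) {
  assert(D.aligned && "descriptor has no buffer");
  return D.aligned + D.offset;
}
```

Treating `aligned` as the start of the data is only correct when `offset` is
zero, which is the case the conversion previously assumed.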
There is an ongoing discussion about whether the `offset` field is needed
at all in the memref descriptor.
Until that is decided, the Python runtime and CRunnerUtils should
still correctly implement the offset handling.
Related: https://reviews.llvm.org/D157008
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D158494
This patch adds a new class Identifier to store identifiers in PresburgerSpace
and their types.
Identifiers were added earlier and were stored as a void pointer, and their type
in the form of mlir::TypeId in PresburgerSpace. To get an identifier, a user of
PresburgerSpace needed to know the type of identifiers. This was a problem for
users of PresburgerSpace like IntegerRelation, which want to work on
identifiers without knowing their type.
The Identifier class allows users like IntegerRelation to work on identifiers
without knowing their type, and also exposes an easier way to work with
Identifiers.
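
The idea can be sketched roughly as follows. This is an illustration under
assumptions, not the class added by the patch: `IdentifierSketch` is a
hypothetical name, and `std::type_index` stands in for mlir::TypeId.

```cpp
#include <cassert>
#include <typeindex>
#include <typeinfo>

// Hypothetical type-erased identifier: stores the pointer together with
// its type so that users do not have to track the type themselves.
class IdentifierSketch {
  const void *Value;
  std::type_index Type;

public:
  template <typename T>
  explicit IdentifierSketch(const T *V) : Value(V), Type(typeid(T)) {}

  // Typed access, for callers that do know the type; checked by assertion.
  template <typename T> const T *getValue() const {
    assert(Type == std::type_index(typeid(T)) && "identifier type mismatch");
    return static_cast<const T *>(Value);
  }

  // Comparison works without knowing the underlying type.
  bool isEqual(const IdentifierSketch &Other) const {
    return Value == Other.Value;
  }
};
```

A user such as IntegerRelation would only need `isEqual`, while code that
created the identifier can still recover the typed pointer.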
Reviewed By: arjunp
Differential Revision: https://reviews.llvm.org/D146909
Fix a typo (incorrectly calling getNumDomainVars instead of
getNumRangeVars) in intersectRange from 3dd9931.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D159413
In dataflow analysis, SAT solver: simplify formula during CNF construction and short-cut
solving when the formula has been recognized as contradictory.
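
A simplified sketch of the idea, not the solver's actual data structures:
`CNFBuilderSketch` is a hypothetical builder that uses variables already
fixed by unit clauses to simplify each new clause as it is added. Literals
are non-zero integers, with a negative value meaning negation.

```cpp
#include <cassert>
#include <cstdlib>
#include <map>
#include <utility>
#include <vector>

struct CNFBuilderSketch {
  std::map<int, bool> Known;             // variable -> value fixed so far
  std::vector<std::vector<int>> Clauses; // simplified clauses
  bool Contradictory = false;            // set once an empty clause appears

  void addClause(const std::vector<int> &Lits) {
    if (Contradictory)
      return; // nothing left to do, the formula is already unsatisfiable
    std::vector<int> Kept;
    for (int L : Lits) {
      assert(L != 0 && "0 is not a valid literal");
      auto It = Known.find(std::abs(L));
      if (It == Known.end()) {
        Kept.push_back(L); // value unknown, keep the literal
        continue;
      }
      if ((L > 0) == It->second)
        return; // literal is known true, the whole clause is satisfied
      // Otherwise the literal is known false and is dropped.
    }
    if (Kept.empty()) {
      Contradictory = true; // every literal is false: contradiction
      return;
    }
    if (Kept.size() == 1)
      Known[std::abs(Kept[0])] = Kept[0] > 0; // remember the new unit
    Clauses.push_back(std::move(Kept));
  }

  // Short-cut: solving is only needed when no contradiction was found.
  bool needsSolving() const { return !Contradictory; }
};
```

This sketch does not revisit clauses added before a unit became known; it is
only meant to show how a contradiction can be detected during construction.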
Reviewed By: sammccall
Differential Revision: https://reviews.llvm.org/D158407
Adds a new feature to MIR patterns: builtin instructions.
They offer some additional capabilities that currently cannot be expressed without falling back to C++ code.
There are two builtins added with this patch, but more can be added later as new needs arise:
- GIReplaceReg
- GIEraseRoot
Depends on D158714, D158713
Reviewed By: arsenm, aemerson
Differential Revision: https://reviews.llvm.org/D158975
Now that the old backend is gone, clean-up a few things that no longer make sense and tidy up the file a bit.
Depends on D158710
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D158714
Remove CodeGen leftovers from the old combiner backend and adapt the API to fit the new backend better.
It's now quite a bit closer to how InstructionSelector works.
- `CombinerInfo` is now a simple "options" struct.
- `Combiner` is now the base class of all TableGen'd combiner implementations.
- Many fields have been moved from derived classes into that class.
- It has been refactored to create & own the Observer and Builder.
- `tryCombineAll` TableGen'd method can now be renamed, which allows targets to implement the actual `tryCombineAll` call manually and do whatever they want to do before/after it.
Note: `CombinerHelper` needs to be mutable because none of its methods are const. This can be revisited later.
Depends on D158710
Reviewed By: aemerson, dsanders
Differential Revision: https://reviews.llvm.org/D158713
There has to be a blank line after a code block. Otherwise the HTML
docs can't be built.
The problem was brought in by ddc80637cc. How careless I was that the
same patch broke the build twice.
mffsl is available since ISA 3.0. The builtin is named with ppc prefix
to follow our convention. For targets earlier than power9, GCC generates
extra code to support the functionality, while this patch does not
implement such behavior.
Reviewed By: nemanjai, tuliom
Differential Revision: https://reviews.llvm.org/D158065
Now strings that are too long for one line in C#, Java, JavaScript, and
Verilog get broken into several lines. C# and JavaScript interpolated
strings are not broken.
A new subclass BreakableStringLiteralUsingOperators is used to handle
the logic for adding plus signs and commas. The updateAfterBroken
method was added because parentheses or braces may now be required after
the plus signs or commas are added. In order to decide whether the
added plus sign should be unindented in the BreakableToken object, the
logic for it is taken out into a separate function
shouldUnindentNextOperator.
The logic for finding the continuation indentation when the option
AlignAfterOpenBracket is set to DontAlign is not implemented yet. So in
that case the new line may have the wrong indentation, and the parts may
have the wrong length if the string needs to be broken more than once
because finding where to break the string depends on where the string
starts.
The preambles for the C# and Java unit tests are changed to the newer
style in order to allow the 3-argument verifyFormat macro. Some cases
are changed from verifyFormat to verifyIncompleteFormat because those
use incomplete code and the new verifyFormat function checks that the
code is complete.
The line in the doc is now indented by 4 spaces, that is, the default
continuation indentation. That has always been the actual behavior; the
doc probably showed 2 spaces by mistake.
This commit was first committed as 16ccba5107. The tests caused
assertion failures. Then it was reverted in 547bce3613.
Reviewed By: MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D154093
When the caret location is on a lower line than the lowest source range,
clang prints the wrong line numbers. The first line number should take the
caret location's line into account even when source ranges are provided.
Current wrong line example: https://godbolt.org/z/aj4qEjzs4
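
The intended rule can be sketched as below. This is an illustration rather
than the diagnostic printer's code; `LineRange` and `firstDisplayedLine` are
hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical line span covered by one source range (1-based, inclusive).
struct LineRange {
  unsigned Begin;
  unsigned End;
};

// The first printed line is the smaller of the caret line and the lowest
// line on which any source range begins.
unsigned firstDisplayedLine(unsigned CaretLine,
                            const std::vector<LineRange> &Ranges) {
  unsigned First = CaretLine;
  for (const LineRange &R : Ranges) {
    assert(R.Begin <= R.End && "malformed range");
    First = std::min(First, R.Begin);
  }
  return First;
}
```

With a caret on line 3 and ranges starting on lines 5 and 8, the numbering
starts at 3 instead of 5.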