RISCV strong prefers i32 values be sign extended to i64. This combine
was always zero extending the constant using APInt methods.
This adjusts the code so that it calls getNode using ISD::ANY_EXTEND instead.
getNode will call TLI.isSExtCheaperThanZExt to decide how to handle
the constant.
Tests were copied from D121598 where I noticed that we were creating
constants that were hard to materialize.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D121650
This code handles fixed vector SPLAT_VECTOR, but is never called in
any tests.
We only form fixed vector splat vectors for vXi64 on RV32 as part
of DAGCombine. This will be type legalized to SPLAT_VECTOR_PARTS.
So the Custom handling for SPLAT_VECTOR is never needed.
This patch makes SPLAT_VECTOR for vXi64 'Legal' on RV32 so that
DAGCombine will create it, but there's no need for Custom handler.
It will still be type legalized to SPLAT_VECTOR_PARTS.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D121673
Terminators are handled specially in the transfer functions so we need an
additional check on whether the analysis has disabled built-in transfer
functions.
Differential Revision: https://reviews.llvm.org/D121694
This can be viewed as swapping the select arms:
https://alive2.llvm.org/ce/z/jUvFMJ
...so we don't have the 'nsz' problem with the more general fold.
This unlocks other folds for the motivating fabs example.
This was discussed in issue #38828.
Prior to this patch, there was no distinction between tests that check
basic assertions and tests that check full-fledged iterator debugging
assertions. Both were disabled when support for the debug mode is not
provided in the dylib, which is stronger than it needs to be.
Furthermore, all of the tests using "debug_macros.h" that contain more
than one assertion in them were broken -- any code after the first
assertion would never be executed.
This patch refactors all of our assertion-related tests to:
1. Be enabled whenever they can, i.e. basic assertions tests are run
even when the debug mode is disabled.
2. Use the superior `check_assertion.h` (previously `debug_mode_helper.h`)
instead of `debug_macros.h`, which allows multiple assertions in the
same program.
3. Coalesce some tests into the same file to make them more readable.
4. Use consistent naming for test files -- no more db{1,2,3,...,10} tests.
This is a large but mostly mechanical patch.
Differential Revision: https://reviews.llvm.org/D121462
ISel for experimental.vp.strided.load|store for v256.32 types via
lowering to vvp_load|store SDNodes.
Reviewed By: kaz7
Differential Revision: https://reviews.llvm.org/D121616
When using `--convert-func-to-llvm=emit-c-wrappers` the attribute arguments of the wrapper would not be created correctly in some cases.
This patch fixes that and introduces a set of tests for (hopefully) all corner cases.
See https://github.com/llvm/llvm-project/issues/53503
Author: Sam Carroll <sam.carroll@lmns.com>
Co-Author: Laszlo Kindrat <laszlo.kindrat@lmns.com>
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D119895
Implement expm1f function that is correctly rounded for all rounding modes. This is based on expf implementation.
From exhaustive testings, using expf implementation, and subtract 1.0 before rounding the final result to single precision
gives correctly rounded results for all |x| > 2^-4 with 1 exception. When |x| < 2^-25, we use x + x^2 (implemented with a
single fma). And for 2^-25 <= |x| <= 2^-4, we use a single degree-8 minimax polynomial generated by Sollya.
Reviewed By: sivachandra, zimmermann6
Differential Revision: https://reviews.llvm.org/D121574
Patch adds a new operation for the SIMD construct. The op is designed to be very similar to the existing `wsloop` operation, so that the `CanonicalLoopInfo` of `OpenMPIRBuilder` can be used.
Reviewed By: shraiysh
Differential Revision: https://reviews.llvm.org/D118065
Fix darwin interface test after D121464. asan_rtl_x86_64.S is not
available on Darwin.
Reviewed By: kstoimenov
Differential Revision: https://reviews.llvm.org/D121636
The decision which categories are relevant for a particular test run
happen very early in the test setup process. They use the SBPlatform
object to determine which categories should be skipped. The platform
object created for this purpose transcends individual test runs.
This setup is not compatible with the direction discussed in
<https://discourse.llvm.org/t/multiple-platforms-with-the-same-name/59594>
-- when platform objects are tied to a specific (SB)Debugger, they need
to be created alongside it, which currently happens in the test setUp
method.
This patch is the first step in that direction -- it rewrites the
category skipping logic to avoid depending on a global SBPlatform
object. Fortunately, the skipping logic is fairly simple (and I believe
it outght to stay that way) and mainly consists of comparing the
platform name against some hardcoded lists. This patch bases this
comparison on the platform name instead of the os part of the triple (as
reported by the platform).
Differential Revision: https://reviews.llvm.org/D121605
This improves the modularity of the bufferization.
From now on, all ops that do not implement BufferizableOpInterface are considered hoisting barriers. Previously, all ops that do not implement the interface were not considered barriers and such ops had to be marked as barriers explicitly. This was unsafe because we could've hoisted across unknown ops where it was not safe to hoist.
As a side effect, this allows for cleaning up AffineBufferizableOpInterfaceImpl. This build unit no longer needed and can be deleted.
Differential Revision: https://reviews.llvm.org/D121519
This replaces the attempt in 20af71f8ec47319d375a871db6fd3889c2487cbd to use combineToExtendBoolVectorInReg to create X86ISD::BLENDV masks directly, instead we use it to canonicalize the iX bitcast to a sign-extended mask and then truncate it back to vXi1 prior to legalization breaking it apart.
Fixes#53760
Hardcode the function type as ParallelTask, which is the guaranteed
pointee type of this runtime function argument (if pointee types
exist). The elimination of the callee bitcast is left for InstCombine.
Differential Revision: https://reviews.llvm.org/D120885
`semantics::IsSaved()` was not applying -Msave/-fno-automatic for main programs.
This caused issues since lowering relies on it to allocate static
variables. This did not match nvfortran/gfortran behaviors where
-fno-automatic/-Msave control the static allocation of scalars in
main programs.
Some program may rely on main program scalars to be statically allocated in
bss (and therefore initialized to zero) with -Msave/-fno-automatic
flags.
Differential Revision: https://reviews.llvm.org/D121603
Also add a TODO to switch to a custom walk instead of the GreedyPatternRewriter, which should be more efficient. (The bufferization pattern is guaranteed to apply only a single time for every op, so a simple walk should suffice.)
We currently specify a top-to-bottom walk order. This is important because other walk orders could introduce additional casts and/or buffer copies. These canonicalize away again, but it is more efficient to never generate them in the first place.
Note: A few of these canonicalizations are not yet implemented.
Differential Revision: https://reviews.llvm.org/D121518
Update places still referencing LoopVectorBody to use the vector loop to
get the vector loop header. This is needed to move vector loop
code-generation to VPlan completely, which in turn is needed to model
pre-header & exit blocks in VPlan as well.
We are going to remove the old 'perfect shuffle' optimization since it
brings performance penalty in hot loop around vectors. For example, in
following loop sharing the same mask:
%v.1 = shufflevector ... <0,1,2,3,8,9,10,11,16,17,18,19,24,25,26,27>
%v.2 = shufflevector ... <0,1,2,3,8,9,10,11,16,17,18,19,24,25,26,27>
The generated instructions will be `vmrglw-vmrghw-vmrglw-vmrghw` instead
of `vperm-vperm`. In some large loop cases, this causes 20%+ performance
penalty.
The original attempt to resolve this is to pre-record masks of every
shufflevector operation in DAG, but that is somewhat complex and brings
unnecessary computation (to scan all nodes) in optimization. Here we
disable it by default. There're indeed some cases becoming worse after
this, which will be fixed in a more careful way in future patches.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D121082
There is currently an awkwardly complex set of rules for how a
parser/printer is generated for AttrDef/TypeDef. It can change depending on if a
mnemonic was specified, if there are parameters, if using the assemblyFormat, if
individual parser/printer code blocks were specified, etc. This commit refactors
this to make what the attribute/type wants more explicit, and to better align
with how formats are specified for operations.
Firstly, the parser/printer code blocks are removed in favor of a
`hasCustomAssemblyFormat` bit field. This aligns with the operation format
specification (and is nice to remove code blocks from ODS).
This commit also adds a requirement to explicitly set `assemblyFormat` or
`hasCustomAssemblyFormat` when the mnemonic is set and the attr/type
has no parameters. This removes the weird implicit matrix of behavior,
and also encourages the author to make a conscious choice of either C++
or declarative format instead of implicitly opting them into the C++
format (we should be pushing towards declarative when possible).
Differential Revision: https://reviews.llvm.org/D121505
The current documentation is super old, crusty, and at times wrong. This commit
rewrites the documentation to focus on the TableGen declarative definition,
expounds on various components, and moves the doc out of Tutorials/ and into
a new top level `AttributesAndTypes.md` doc. As part of this, the AttrDef/TypeDef
documentation in OpDefinitions.md is removed.
Differential Revision: https://reviews.llvm.org/D120011
OpBase.td has formed into a huge monolith of all ODS constructs. This
commits starts to rectify that by splitting out some constructs to their
own .td files.
Differential Revision: https://reviews.llvm.org/D118636
This patch adds support for custom directives in attribute and type formats. Custom directives dispatch calls to user-defined parser and printer functions.
For example, the assembly format "custom<Foo>($foo, ref($bar))" expects a function with the signature
```
LogicalResult parseFoo(AsmParser &parser, FailureOr<FooT> &foo, BarT bar);
void printFoo(AsmPrinter &printer, FooT foo, BarT bar);
```
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D120944