1130 Commits

Author SHA1 Message Date
Bjorn Pettersson
6d1749e6c0 [SelectionDAG] Fix miscompile bugs related to smul.fix.sat with scale zero
When expanding a SMULFIXSAT ISD node (usually originating from
a smul.fix.sat intrinsic) we've applied some optimizations for
the special case when the scale is zero. The idea has been that
it would be cheaper to use an SMULO instruction (if legal) to
perform the multiplication and at the same time detect any overflow.
And in case of overflow we could use some SELECT:s to replace the
result with the saturated min/max value. The only tricky part
is to know if we overflowed on the min or max value, i.e. if the
product is positive or negative. Unfortunately the implementation
has been incorrect as it has looked at the product returned by the
SMULO to determine the sign of the product. In case of overflow that
product is truncated and won't give us the correct sign bit.

This patch is adding an extra XOR of the multiplication operands,
which is used to determine the sign of the non truncated product.

This patch fixes PR51677.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D108938

(cherry picked from commit 789f01283d52065b10049b58a3288c4abd1ef351)
2021-08-31 20:59:28 -07:00
Alexandros Lamprineas
276fcebbe0 [AArch64] Legalize MVT::i64x8 in DAG isel lowering
This patch legalizes the Machine Value Type introduced in D94096 for loads
and stores. A new target hook named getAsmOperandValueType() is added which
maps i512 to MVT::i64x8. GlobalISel falls back to DAG for legalization.

Differential Revision: https://reviews.llvm.org/D94097
2021-08-02 15:45:58 +01:00
Fraser Cormack
2df597c58a [SelectionDAG] Support scalable-vector splats in yet more cases
This patch extends support for (scalable-vector) splats in the
DAGCombiner via the `ISD::matchBinaryPredicate` function, which enable a
variety of simple combines of constants.

Users of this function may now have to distinguish between
`BUILD_VECTOR` and `SPLAT_VECTOR` vector operands. The way of dealing
with this in-tree follows the approach added for
`ISD::matchUnaryPredicate` implemented in D94501.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D106575
2021-07-26 10:15:08 +01:00
Simon Pilgrim
cb0d04c29e [DAG] Add initial SelectionDAG::isGuaranteedNotToBeUndefOrPoison framework (PR51129)
I've setup the basic framework for the isGuaranteedNotToBeUndefOrPoison call and updated DAGCombiner::visitFREEZE to use it, further Opcodes can be handled when we have test coverage.

I'm not aware of any vector test freeze coverage so the DemandedElts (and the Depth) args are not being used yet - but they are in place.

SelectionDAG::isGuaranteedNotToBePoison wrappers have also been added.

Differential Revision: https://reviews.llvm.org/D106668
2021-07-24 11:36:35 +01:00
Roman Lebedev
fb2a58c6ab [NFCI][TLI] prepare[US]REMEqFold(): don't add nonsensical 'exact' flag to rotates created
As pointed out by Craig Topper.
2021-07-22 23:02:58 +03:00
Roman Lebedev
19d08eac86 [TLI] prepareSREMEqFold(): use correct VT for the final VSELECT (PR51133)
We were using the wrong VT for this final VSELECT,
it should be in the final comparison VT,
not the source value's VT.

Fixes https://bugs.llvm.org/show_bug.cgi?id=51133
2021-07-19 16:44:00 +03:00
Arthur Eubanks
266a9a84be [OpaquePtr] Use ArgListEntry::IndirectType for lowering ABI attributes
Consolidate PreallocatedType and ByValType into IndirectType, and use that for inalloca.
2021-07-07 14:58:38 -07:00
Bradley Smith
f53921a516 [SelectionDAG] Implement PromoteIntRes_INSERT_SUBVECTOR
Inserting into a smaller-than-legal scalable vector would result in an
internal compiler error. For example, inserting a <vscale x 4 x i8> into
a <vscale x 8 x i8> (both illegal vector types for SVE) would cause a
crash.

This crash was happening because there was no code to promote (legalise)
the result of an INSERT_SUBVECTOR node.

This patch implements PromoteIntRes_INSERT_SUBVECTOR, which legalises
the ISD node. This is currently done by going through memory. This is
necessary because of the requirement that the SubVec parameter of the
INSERT_SUBVECTOR node must be smaller than the Vec parameter, which
means that INSERT_SUBVECTOR cannot always have a legal result/operand
types.

Co-Authored-by: Joe Ellis <joe.ellis@arm.com>

Differential Revision: https://reviews.llvm.org/D102766
2021-07-01 17:05:53 +01:00
Bradley Smith
ef187efe53 [TargetLowering][AArch64][SVE] Take into account accessed type when clamping address
When clamping the index for a memory access to a stacked vector we must
take into account the entire type being accessed, not just assume that
we are accessing only a single element.

Differential Revision: https://reviews.llvm.org/D105016
2021-06-30 13:30:18 +01:00
Martin Storsjö
9d14adb9f6 [llvm] Rename StringRef _lower() method calls to _insensitive()
This is a mechanical change. This actually also renames the
similarly named methods in the SmallString class, however these
methods don't seem to be used outside of the llvm subproject, so
this doesn't break building of the rest of the monorepo.
2021-06-25 00:22:01 +03:00
Craig Topper
c1782c733f [TargetLowering][ARM] Don't alter opaque constants in TargetLowering::ShrinkDemandedConstant.
We don't constant fold based on demanded bits elsewhere in
SimplifyDemandedBits, so I don't think we should shrink them either.

The affected ARM test changes because a constant become non-opaque
and eventually enabled some constant folding. This no longer happens.
I checked and InstCombine is able to simplify this test. I'm not sure exactly
what it was trying to test.

Reviewed By: lebedev.ri, dmgreen

Differential Revision: https://reviews.llvm.org/D104832
2021-06-24 10:09:36 -07:00
Roman Lebedev
8401a57c05 [TLI] SimplifyDemandedVectorElts(): handle SCALAR_TO_VECTOR(EXTRACT_VECTOR_ELT(?, 0))
Iff we have `SCALAR_TO_VECTOR` (and we demand it's only defined 0'th element),
and said scalar was produced by `EXTRACT_VECTOR_ELT` from the 0'th element
of some vector, then we can just continue traversal into said source vector.

This comes up in X86 vector uniform shift lowering.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D104250
2021-06-14 23:52:53 +03:00
Arthur Eubanks
4cc1a31398 Revert "[TargetLowering] Only inspect attributes in the arguments for ArgListEntry"
Needs to be discussed more.

This reverts commit 255a5c1baa6020c009934b4fa342f9f6dbbcc46
This reverts commit df2056ff3730316f376f29d9986c9913b95ceb1
This reverts commit faff79b7ca144e505da6bc74aa2b2f7cffbbf23
This reverts commit d2a9020785c6e02afebc876aa2778fa64c5cafd
2021-06-07 16:07:44 -07:00
Arthur Eubanks
b01ec7e228 [TargetLowering] Only inspect attributes in the arguments for ArgListEntry
Parameter attributes are considered part of the function [1], and like
mismatched calling conventions [2], we can't have the verifier check for
mismatched parameter attributes.

Issues can be diagnosed with D103412.

[1] https://llvm.org/docs/LangRef.html#parameter-attributes
[2] https://llvm.org/docs/FAQ.html#why-does-instcombine-simplifycfg-turn-a-call-to-a-function-with-a-mismatched-calling-convention-into-unreachable-why-not-make-the-verifier-reject-it

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D101806
2021-06-03 15:52:01 -07:00
Arthur Eubanks
f9c1930dea Revert "[TargetLowering] Only inspect attributes in the arguments for ArgListEntry"
This reverts commit 1c7f32334d4becc725b9025fd32291a0e5729acd.

Some code still needs to properly set parameter ABI attributes, see
D101806.
2021-05-29 23:08:15 -07:00
Arthur Eubanks
85767d0682 Revert "[NFC] Use ArgListEntry indirect types more in ISel lowering"
This reverts commit bc7d15c61da78864b35e3c114294d6e4db645611.

Dependent change is to be reverted.
2021-05-29 22:40:33 -07:00
Arthur Eubanks
2682845f03 [NFC] Use ArgListEntry indirect types more in ISel lowering
For opaque pointers, we're trying to avoid uses of
PointerType::getElementType().

A couple of ISel places use PointerType::getElementType(). Some of these
are easy to fix by using ArgListEntry's indirect types.

The inalloca type wasn't stored there, as opposed to preallocated and
byval which have their indirect types available, so add it and use it.

This is a reland after an MSan fix in D102667.

Differential Revision: https://reviews.llvm.org/D101713
2021-05-18 14:30:22 -07:00
Arthur Eubanks
b8775b2a78 [TargetLowering] Only inspect attributes in the arguments for ArgListEntry
Parameter attributes are considered part of the function [1], and like
mismatched calling conventions [2], we can't have the verifier check for
mismatched parameter attributes.

This is a reland after fixing MSan issues in D102667.

[1] https://llvm.org/docs/LangRef.html#parameter-attributes
[2] https://llvm.org/docs/FAQ.html#why-does-instcombine-simplifycfg-turn-a-call-to-a-function-with-a-mismatched-calling-convention-into-unreachable-why-not-make-the-verifier-reject-it

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D101806
2021-05-18 14:30:22 -07:00
Simon Pilgrim
633bafaf5f [TargetLowering] prepareUREMEqFold/prepareSREMEqFold - account for non legal shift types
Ensure we tell getShiftAmountTy that we're working with pre-legalized types to prevent cases where the (legalized) shift type can no longer handle the (non-legalized) type width.

Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=34366
2021-05-17 11:03:27 +01:00
Arthur Eubanks
99d811bdc5 Revert "[TargetLowering] Only inspect attributes in the arguments for ArgListEntry"
This reverts commit 16748bd2fb1fe10d7d097961f1988327338f3f9f.

Causes https://crbug.com/1209013
2021-05-16 22:02:10 -07:00
Arthur Eubanks
630c86e151 Revert "[NFC] Use ArgListEntry indirect types more in ISel lowering"
This reverts commit 85af8a8c1b574faa0d5d57d189ae051debdfada8.
2021-05-16 22:00:54 -07:00
Tim Northover
5661b7eb80 IR+AArch64: add a "swiftasync" argument attribute.
This extends any frame record created in the function to include that
parameter, passed in X22.

The new record looks like [X22, FP, LR] in memory, and FP is stored with 0b0001
in bits 63:60 (CodeGen assumes they are 0b0000 in normal operation). The effect
of this is that tools walking the stack should expect to see one of three
values there:

  * 0b0000 => a normal, non-extended record with just [FP, LR]
  * 0b0001 => the extended record [X22, FP, LR]
  * 0b1111 => kernel space, and a non-extended record.

All other values are currently reserved.

If compiling for arm64e this context pointer is address-discriminated with the
discriminator 0xc31a and the DB (process-specific) key.

There is also an "i8** @llvm.swift.async.context.addr()" intrinsic providing
front-ends access to this slot (and forcing its creation initialized to nullptr
if necessary).
2021-05-14 11:43:58 +01:00
Arthur Eubanks
2278264558 [NFC] Use ArgListEntry indirect types more in ISel lowering
For opaque pointers, we're trying to avoid uses of
PointerType::getElementType().

A couple of ISel places use PointerType::getElementType(). Some of these
are easy to fix by using ArgListEntry's indirect types.

The inalloca type wasn't stored there, as opposed to preallocated and
byval which have their indirect types available, so add it and use it.

Differential Revision: https://reviews.llvm.org/D101713
2021-05-10 13:05:15 -07:00
Arthur Eubanks
3be78fa191 [TargetLowering] Only inspect attributes in the arguments for ArgListEntry
Parameter attributes are considered part of the function [1], and like
mismatched calling conventions [2], we can't have the verifier check for
mismatched parameter attributes.

[1] https://llvm.org/docs/LangRef.html#parameter-attributes
[2] https://llvm.org/docs/FAQ.html#why-does-instcombine-simplifycfg-turn-a-call-to-a-function-with-a-mismatched-calling-convention-into-unreachable-why-not-make-the-verifier-reject-it

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D101806
2021-05-10 12:35:11 -07:00
Simon Pilgrim
742cb688cc [DAG] Add a generic expansion for SHIFT_PARTS opcodes using funnel shifts
Based off a discussion on D89281 - where the AARCH64 implementations were being replaced to use funnel shifts.

Any target that has efficient funnel shift lowering can handle the shift parts expansion using the same expansion, avoiding a lot of duplication.

I've generalized the X86 implementation and moved it to TargetLowering - so far I've found that AARCH64 and AMDGPU benefit, but many other targets (ARM, PowerPC + RISCV in particular) could easily use this with a few minor improvements to their funnel shift lowering (or the folding of their target ops that funnel shifts lower to).

NOTE: I'm trying to avoid adding full SHIFT_PARTS legalizer handling as I think it might actually be possible to remove these opcodes in the medium-term and use funnel shift / libcall expansion directly.

Differential Revision: https://reviews.llvm.org/D101987
2021-05-07 13:12:30 +01:00
Craig Topper
50539da1c1 [SelectionDAG] Use a VTSDNode to store the saturation width for FP_TO_SINT_SAT/FP_TO_UINT_SAT
Previously we used an i32 constant to store the saturation width, but i32 isn't
legal on RISCV64. This wasn't a big deal to fix, but it is extra work for the
type legalizer.

This patch uses a VTSDNode to store the type similar to SEXT_INREG. This makes
it opaque to the type legalizer.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D101262
2021-04-27 14:38:42 -07:00
Dávid Bolvanský
7c4c0f3460 [Analysis] Attribute alignment should not prevent tail call optimization
Fixes tail folding issue mentioned in D100879.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D101230
2021-04-24 19:57:42 +02:00
Simon Pilgrim
2a2d25a7be [DAG] TargetLowering.cpp - breakup if-else chains where each block returns. NFCI.
Match style guide that requests that if+return blocks are separate.
2021-04-21 11:17:27 +01:00
David Sherwood
40e085b6bd [CodeGen] Improve code generation for clamping of constant indices with scalable vectors
When trying to clamp a constant index into a scalable vector we can
test if the index is less than the minimum number of elements in the
vector. If so, we can simply return the index because we know it is
guaranteed to fit inside the vector.

Differential Revision: https://reviews.llvm.org/D100639
2021-04-19 08:34:17 +01:00
Serge Guelton
b2fe6dcbc2 Normalize interaction with boolean attributes
Such attributes can either be unset, or set to "true" or "false" (as string).
throughout the codebase, this led to inelegant checks ranging from

        if (Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true")

to

        if (Fn->hasAttribute("no-jump-tables") && Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true")

Introduce a getValueAsBool that normalize the check, with the following
behavior:

no attributes or attribute set to "false" => return false
attribute set to "true" => return true

Differential Revision: https://reviews.llvm.org/D99299
2021-04-17 08:17:33 +02:00
Momchil Velikov
d98e321d12 [clang][AArch64] Correctly align HFA arguments when passed on the stack
When we pass a AArch64 Homogeneous Floating-Point
Aggregate (HFA) argument with increased alignment
requirements, for example

    struct S {
      __attribute__ ((__aligned__(16))) double v[4];
    };

Clang uses `[4 x double]` for the parameter, which is passed
on the stack at alignment 8, whereas it should be at
alignment 16, following Rule C.4 in
AAPCS (https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#642parameter-passing-rules)

Currently we don't have a way to express in LLVM IR the
alignment requirements of the function arguments. The align
attribute is applicable to pointers only, and only for some
special ways of passing arguments (e..g byval). When
implementing AAPCS32/AAPCS64, clang resorts to dubious hacks
of coercing to types, which naturally have the needed
alignment. We don't have enough types to cover all the
cases, though.

This patch introduces a new use of the stackalign attribute
to control stack slot alignment, when and if an argument is
passed in memory.

The attribute align is left as an optimizer hint - it still
applies to pointer types only and pertains to the content of
the pointer, whereas the alignment of the pointer itself is
determined by the stackalign attribute.

For byval arguments, the stackalign attribute assumes the
role, previously perfomed by align, falling back to align if
stackalign` is absent.

On the clang side, when passing arguments using the "direct"
style (cf. `ABIArgInfo::Kind`), now we can optionally
specify an alignment, which is emitted as the new
`stackalign` attribute.

Patch by Momchil Velikov and Lucas Prates.

Differential Revision: https://reviews.llvm.org/D98794
2021-04-15 22:58:14 +01:00
Simonas Kazlauskas
c1d491f5a6 Support {S,U}REMEqFold before legalization
This allows these optimisations to apply to e.g. `urem i16` directly
before `urem` is promoted to i32 on architectures where i16 operations
are not intrinsically legal (such as on Aarch64). The legalization then
later can happen more directly and generated code gets a chance to avoid
wasting time on computing results in types wider than necessary, in the end.

Seems like mostly an improvement in terms of results at least as far as x86_64 and aarch64 are concerned, with a few regressions here and there. It also helps in preventing regressions in changes like {D87976}.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D88785
2021-04-01 01:35:41 +03:00
Bradley Smith
4cc2f2b476 [SelectionDAG][AArch64][SVE] Perform SETCC condition legalization in LegalizeVectorOps
This is currently performed in SelectionDAGLegalize, here we make it also
happen in LegalizeVectorOps, allowing a target to lower the SETCC condition
codes first in LegalizeVectorOps and then lower to a custom node afterwards,
without having to duplicate all of the SETCC condition legalization in the
target specific lowering.

As a result of this, fixed length floating point SETCC nodes can now be
properly lowered for SVE.

Differential Revision: https://reviews.llvm.org/D98939
2021-03-29 15:32:25 +01:00
Cullen Rhodes
6682076a17 [IR] Introduce llvm.experimental.vector.splice intrinsic
This patch introduces a new intrinsic @llvm.experimental.vector.splice
that constructs a vector of the same type as the two input vectors,
based on a immediate where the sign of the immediate distinguishes two
variants. A positive immediate specifies an index into the first vector
and a negative immediate specifies the number of trailing elements to
extract from the first vector.

For example:

  @llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1) ==> <B, C, D, E>  ; index
  @llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, -3) ==> <B, C, D, E> ; trailing element count

These intrinsics support both fixed and scalable vectors, where the
former is lowered to a shufflevector to maintain existing behaviour,
although while marked as experimental the recommended way to express
this operation for fixed-width vectors is to use shufflevector. For
scalable vectors where it is not possible to express a shufflevector
mask for this operation, a new ISD node has been implemented.

This is one of the named shufflevector intrinsics proposed on the
mailing-list in the RFC at [1].

Patch by Paul Walker and Cullen Rhodes.

[1] https://lists.llvm.org/pipermail/llvm-dev/2020-November/146864.html

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D94708
2021-03-09 10:44:22 +00:00
Craig Topper
787f70958a [TargetLowering] Use HandleSDNodes to prevent nodes from being deleted by recursive calls in getNegatedExpression.
For binary or ternary ops we call getNegatedExpression multiple
times and then compare costs. While we're doing this we need to
hold a node from the first call across the second call, but its
not yet attached to the DAG. Its possible the second call creates
an identical node and then decides it didn't need it so will try
to delete it if it has no uses. This can cause a reference to the
node we're holding further up the call stack to become invalidated.

To prevent this, we can use a HandleSDNode to artifically give
the node a use without connecting it to the DAG.

I've used a std::list of HandleSDNodes so we can create handles
only when we have a node to hold. HandleSDNode does not have
default constructor and cannot be copied or moved.

Fixes PR49393.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D97914
2021-03-04 22:48:25 -08:00
Simon Pilgrim
040a134965 [DAG] TargetLowering::BuildUDIV - use APInt as const ref. NFCI.
Fixes clang-tidy warning.
2021-03-04 12:15:08 +00:00
Simon Pilgrim
dc7318caba [DAG] expandAddSubSat - break if-else chain. NFCI.
Fix styleguide issue - each if() block always returns so we don't need to make them a if-else chain.
2021-02-26 11:02:08 +00:00
Kazu Hirata
38cc9ea5cc [CodeGen] Use range-based for loops (NFC) 2021-02-21 19:58:07 -08:00
Craig Topper
13979f4812 [SelectionDAG][AArch64] Restrict matchUnaryPredicate to only handle SPLAT_VECTOR for scalable vectors.
fde24661718c7812a20a10e518cd853e8e060107 added support for
scalable vectors to matchUnaryPredicate by handling SPLAT_VECTOR in
addition to BUILD_VECTOR. This was used to enabled UDIV/SDIV/UREM/SREM
by constant expansion in BuildUDIV/BuildSDIV in TargetLowering.cpp

The caller there expects to call getBuildVector from the match factors.
This leads to a crash right now if there is a SPLAT_VECTOR of
fixed vectors since the number of vectors won't match the number
of elements.

To fix this, this patch updates the callers to check the opcode
instead of whether the type is fixed or scalable. This assumes
that only 3 opcodes are handled by matchUnaryPredicate so
I've added an assertion to the final else to check that opcode.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D96174
2021-02-16 09:22:46 -08:00
Craig Topper
9bc03ac328 [RISCV][LegalizeTypes] Try to expand BITREVERSE before promoting if the promoted BITREVERSE would expand anyway.
If we're going to end up expanding anyway, we should do it early
so we don't create extra operations to handle the bytes added by
promotion.

Simlilar was done for BSWAP previously.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D96681
2021-02-15 12:33:16 -08:00
Simon Pilgrim
1123f44663 [DAG] Fix shift amount limit in SimplifyDemandedBits trunc(shift(x,c)) to truncated bitwidth
We lost this in D56387/rG69bc0990a9181e6eb86228276d2f59435a7fae67 - where I got the src/dst bitwidths mixed up and assumed getValidShiftAmountConstant would catch it.

Patch by @craig.topper - confirmed by @Carrot that it fixes PR49162
2021-02-13 12:00:08 +00:00
Craig Topper
32e1c9e6be [TargetLowering][RISCV][AArch64][PowerPC] Enable BuildUDIV/BuildSDIV on illegal types before type legalization if we can find a larger legal type that supports MUL.
If we wait until the type is legalized, we'll lose information
about the orginal type and need to use larger magic constants.
This gets especially bad on RISCV64 where i64 is the only legal
type.

I've limited this to simple scalar types so it only works for
i8/i16/i32 which are most likely to occur. For more odd types
we might want to do a small promotion to a type where MULH is legal
instead.

Unfortunately, this does prevent some urem/srem+seteq matching since
that still require legal types.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D96210
2021-02-11 09:43:13 -08:00
Craig Topper
b58d6a4b20 [TargetLowering] Use Align in allowsMisalignedMemoryAccesses.
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96097
2021-02-04 19:22:06 -08:00
Craig Topper
7f5abbda56 [TargetLowering] Use LegalOnly operand to isOperationLegalOrCustom to simplify some code. NFC 2021-02-04 12:30:37 -08:00
xgupta
ddd3adad6f [Branch-Rename] Fix some links
According to the [[ https://foundation.llvm.org/docs/branch-rename/ | status of branch rename ]], the master branch of the LLVM repository is removed on 28 Jan 2021.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D95766
2021-02-01 16:43:21 +05:30
Craig Topper
6f831b56f3 [RISCV][LegalizeTypes] Try to expand BSWAP before promoting if the promoted BSWAP would expand anyway.
If we're going to end up expanding anyway, we should do it early
so we don't create extra operations to handle the bytes added by
promotion.

This is helfpul on RISCV where we might have to promote i16 all
the way to i64.

Differential Revision: https://reviews.llvm.org/D95756
2021-01-31 14:33:29 -08:00
Craig Topper
0571d7eb26 [TargetLowering][RISCV] Don't transform (seteq/ne (sext_inreg X, VT), C1) -> (seteq/ne (zext_inreg X, VT), C1) if the sext_inreg is cheaper
RISCV has to use 2 shifts for (i64 (zext_inreg X, i32)), but we
can use addiw rd, rs1, x0 for sext_inreg. We already understood this
when type legalizing i32 seteq/ne on rv64. But this transform in
SimplifySetCC would sometimes undo it.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D95289
2021-01-25 16:37:21 -08:00
Fraser Cormack
c88b1ceeef [SelectionDAG] Support scalable-vector splats in more cases
This patch adds support for scalable-vector splats in DAGCombiner's
`isConstantOrConstantVector` and `ISD::matchUnaryPredicate` functions,
which enable the SelectionDAG div/rem-by-constant optimizations for
scalable vector types.

It also fixes up one case where the UDIV optimization was generating a
SETCC without first consulting the target for its preferred SETCC result
type.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D94501
2021-01-25 10:58:15 +00:00
QingShan Zhang
d5b70bbb38 [NFC] [DAGCombine] Correct the result for sqrt even the iteration is zero
For now, we correct the result for sqrt if iteration > 0. This doesn't make
sense as they are not strict relative.

Reviewed By: dmgreen, spatel, RKSimon

Differential Revision: https://reviews.llvm.org/D94480
2021-01-25 04:02:44 +00:00
Craig Topper
4a8863f5e1 [TargetLowering] Use isOneConstant to simplify some code. NFC 2021-01-22 19:32:19 -08:00