Commit Graph

138683 Commits

Author SHA1 Message Date
alex-t
c025a25bff [AMDGPU] SILowerControlFlow::optimizeEndCF should remove empty basic block
optimizeEndCF removes the EXEC restoring instruction when that instruction is the only one in the block, apart from the branch to the single successor, and the successor contains an EXEC mask restoring instruction that was lowered from an END_CF belonging to IF_ELSE.
As a result of this optimization we get a basic block whose only instruction is a branch to the single successor.
If control flow can reach such an empty block from S_CBRANCH_EXEZ/EXECNZ, spill/reload instructions inserted later by the register allocator may be placed under an exec == 0 condition and never execute.
Removing the empty block solves the problem.

This change requires further work to re-implement LIS updates. Currently, LIS is always nullptr in this pass. Enabling it needs another patch that fixes many places across the codegen.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D86634
2020-09-07 19:37:27 +03:00
Momchil Velikov
8e2bba4fdc Reduce the number of memory allocations when displaying
a warning about clobbering reserved registers (NFC).

Also address some minor inefficiencies and style issues.

Differential Revision: https://reviews.llvm.org/D86088
2020-09-07 17:04:00 +01:00
Simon Pilgrim
9b5af79762 [X86][SSE] Don't use LowerVSETCCWithSUBUS for unsigned compare with +ve operands (PR47448)
We already simplify the unsigned comparisons if we've found the operands are non-negative, but we were still calling LowerVSETCCWithSUBUS which resulted in the PR47448 regressions.
2020-09-07 16:11:40 +01:00
Simon Pilgrim
8f66f4b7f0 [X86] Replace UpgradeX86AddSubSatIntrinsics with UpgradeX86BinaryIntrinsics generic helper. NFCI.
Feed the Intrinsic::ID value directly instead of via the IsSigned/IsAddition bool flags.
2020-09-07 15:57:18 +01:00
Sanjay Patel
7f2a42d49c [InstCombine] erase instructions leading up to unreachable
Normal dead code elimination ignores assume intrinsics, so we fail to
delete assumes that are not meaningful (and potentially worse if they
cause conflicts with other assumptions).

The motivating example in https://llvm.org/PR47416 suggests that we
might have problems upstream from here (difference between C and C++),
but this should be a cheap way to make sure we remove more dead code.

Differential Revision: https://reviews.llvm.org/D87149
2020-09-07 10:44:08 -04:00
Simon Pilgrim
e1f9ca6993 [X86] Auto upgrade SSE/AVX PABS intrinsics to generic Intrinsic::abs
Minor followup to D87101, we were expanding this to a neg+icmp+select pattern like we were in CGBuiltin
2020-09-07 15:07:26 +01:00
Simon Pilgrim
9a57b8dde1 MachineStableHash.h - remove MachineInstr.h include. NFC.
Use forward declarations and move the include to MachineStableHash.cpp
2020-09-07 13:33:48 +01:00
Simon Wallis
63f9b9c858 [SelectionDAG] memcpy expansion of const volatile struct ignores const zero
In getMemcpyLoadsAndStores(), a memcpy where the source is a zero constant is expanded to a MemOp::Set instead of a MemOp::Copy, even when the memcpy is volatile.
This is incorrect.

The fix is to add a check for volatile, and expand to MemOp::Copy in the volatile case.

Reviewed By: chill

Differential Revision: https://reviews.llvm.org/D87134
2020-09-07 13:22:09 +01:00
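The volatile check described in this commit can be sketched as follows. This is a minimal Python model, not the SelectionDAG code: `choose_mem_op` and its parameters are hypothetical names, and only the Set/Copy decision is modelled.

```python
from enum import Enum


class MemOp(Enum):
    SET = "Set"    # store the constant directly, no load from the source
    COPY = "Copy"  # load from the source, then store to the destination


def choose_mem_op(src_is_zero_constant: bool, is_volatile: bool) -> MemOp:
    """Pick the expansion kind for a memcpy (hypothetical helper)."""
    # A volatile memcpy must still read its source, so the zero-constant
    # shortcut only applies to non-volatile copies.
    if src_is_zero_constant and not is_volatile:
        return MemOp.SET
    return MemOp.COPY
```

With this model, only the non-volatile copy from a zero constant takes the Set path; every volatile memcpy stays a Copy.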
Sanjay Patel
9e71816984 [InstCombine] give a name to an intermediate value for easier tracking; NFC
As noted in PR47430, we probably want to conditionally include 'nsw'
here anyway, so we are going to need to fill out the optional args.
2020-09-07 08:19:42 -04:00
Simon Pilgrim
60f05bdb77 LegalizeTypes.h - remove orphan SplitVSETCC declaration. NFCI.
The implementation no longer exists
2020-09-07 13:11:49 +01:00
Simon Pilgrim
c25887b4c4 X86AvoidStoreForwardingBlocks.cpp - use unsigned for Opcode values. NFCI.
Fixes clang-tidy cppcoreguidelines-narrowing-conversions warnings.
2020-09-07 12:56:27 +01:00
Simon Pilgrim
950a62206e [X86][AVX] Use lowerShuffleWithPERMV in shuffle combining to support non-VLX targets
lowerShuffleWithPERMV allows us to use the ZMM variants for 128/256-bit variable shuffles on non-VLX AVX512 targets.

This is another step towards shuffle combining between vector widths - we still end up with an annoying regression (combine_vpermilvar_vperm2f128_zero_8f32) but we're going in the right direction...
2020-09-07 12:50:50 +01:00
Sam Parker
70e9782c55 [SCEV] Refactor isHighCostExpansionHelper
To enable the cost of constants, the helper function has been
reorganised:
- A struct has been introduced to hold SCEV operand information so
  that we know the user of the operand, as well as the operand index.
  The Worklist now uses this struct instead of a bare SCEV.
- The costing of each SCEV, and collection of its operands, is now
  performed in a helper function.

Differential Revision: https://reviews.llvm.org/D86050
2020-09-07 11:57:46 +01:00
Benjamin Kramer
f3d58580a1 [X86] Unbreak the build after 22fa6b20d92e
2020-09-07 12:24:30 +02:00
Simon Pilgrim
1db7a4fa9e [X86] getFauxShuffleMask - handle insert_subvector(zero, sub, C)
Directly use SM_SentinelZero elements if we're (widening)inserting into a zero vector.
2020-09-07 11:10:40 +01:00
Simon Pilgrim
4fee922db4 [X86] Use Register instead of unsigned. NFCI.
Fixes llvm-prefer-register-over-unsigned clang-tidy warnings.
2020-09-07 10:49:29 +01:00
Simon Pilgrim
bd029710f2 [X86] Use Register instead of unsigned. NFCI.
Fixes llvm-prefer-register-over-unsigned clang-tidy warnings.
2020-09-07 10:38:09 +01:00
Simon Pilgrim
c1e4cc6249 [X86] Use Register instead of unsigned. NFCI.
Fixes llvm-prefer-register-over-unsigned clang-tidy warning.
2020-09-07 10:38:08 +01:00
Sam Parker
7c4a7cb063 [SimplifyCFG] Consider cost of combining predicates.
Modify FoldBranchToCommonDest to consider the cost of inserting
instructions when attempting to combine predicates to fold blocks.
The threshold can be controlled via a new option:
-simplifycfg-branch-fold-threshold which defaults to '2' to allow
the insertion of a not and another logical operator.

Differential Revision: https://reviews.llvm.org/D86526
2020-09-07 10:04:50 +01:00
Jay Foad
a04922a28f [GlobalISel] Extend not_cmp_fold to work on conditional expressions
Differential Revision: https://reviews.llvm.org/D86709
2020-09-07 09:31:08 +01:00
Sam Parker
c37b434c46 [ARM][CostModel] CodeSize costs for i1 arith ops
When optimising for size, make the cost of i1 logical operations
relatively expensive so that optimisations don't try to combine
predicates.

Differential Revision: https://reviews.llvm.org/D86525
2020-09-07 09:27:18 +01:00
Xing GUO
5f500e98a4 [DWARFYAML] Make the debug_addr section optional.
This patch makes the debug_addr section optional. When an empty
debug_addr section is specified, yaml2obj only emits a section header
for it.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D87205
2020-09-07 16:17:18 +08:00
Jay Foad
3381408aad [KnownBits] Implement accurate unsigned and signed max and min
Use the new implementation in ValueTracking, SelectionDAG and
GlobalISel.

Differential Revision: https://reviews.llvm.org/D87034
2020-09-07 09:09:01 +01:00
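To illustrate what a known-bits unsigned max has to compute, here is a simplified Python sketch. It is not the LLVM implementation and is less precise than an accurate one: the `KnownBits` class, its field names, and the 8-bit default width are assumptions made for this example. It only uses two facts: if one operand is provably at least as large as the other it is the result, and otherwise the result is one of the two operands, so bits known identically in both remain known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnownBits:
    zero: int  # mask of bits known to be 0
    one: int   # mask of bits known to be 1
    width: int = 8

    def min_value(self) -> int:
        # Smallest unsigned value: every unknown bit is 0.
        return self.one

    def max_value(self) -> int:
        # Largest unsigned value: every unknown bit is 1.
        return ((1 << self.width) - 1) & ~self.zero


def umax(lhs: KnownBits, rhs: KnownBits) -> KnownBits:
    # If one operand is provably >= the other, it is the result.
    if lhs.min_value() >= rhs.max_value():
        return lhs
    if rhs.min_value() >= lhs.max_value():
        return rhs
    # Otherwise the result is one of the two operands, so only bits
    # known identically in both operands stay known.
    return KnownBits(lhs.zero & rhs.zero, lhs.one & rhs.one, lhs.width)
```

The unsigned min and the signed variants follow the same shape, with the comparisons flipped or performed on signed ranges.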
dongAxis
f58ab2a818 When dumping results of StackLifetime, it will print the following
log:

BB  [7, 8): begin {}, end {}, livein {}, liveout {}
BB  [1, 2): begin {}, end {}, livein {}, liveout {}
...

This makes it hard to tell which basic block each line refers to,
so this patch adds the basic block name to the output.

Reviewed By: vitalybuka
TestPlan: check-llvm
Differential Revision: https://reviews.llvm.org/D87152
2020-09-07 11:43:16 +08:00
Zi Xuan Wu
a51dedaf25 [ELF] Add a new e_machine value EM_CSKY and add some CSKY relocation types
This is the split part of D86269, which adds a new ELF machine flag called EM_CSKY and related relocations.
Some target-specific flags and tests for csky can be added in follow-up patches later.

Differential Revision: https://reviews.llvm.org/D86610
2020-09-07 10:42:28 +08:00
Thomas Lively
08861e0346 [WebAssembly] Fix incorrect assumption of simple value types
Fixes PR47375, in which an assertion was triggering because
WebAssemblyTargetLowering::isVectorLoadExtDesirable was improperly
assuming the use of simple value types.

Differential Revision: https://reviews.llvm.org/D87110
2020-09-06 15:42:21 -07:00
Amy Kwan
2477050bd8 [PowerPC] Implement Vector Expand Mask builtins in LLVM/Clang
This patch implements the vec_expandm function prototypes in altivec.h in order
to utilize the vector expand with mask instructions introduced in Power10.

Differential Revision: https://reviews.llvm.org/D82727
2020-09-06 17:13:21 -05:00
Nikita Popov
99a542ef5d [ValueTracking] Avoid known bits fallback for non-zero get check (NFCI)
The known bits fallback will never be able to infer a non-null
value here, so don't bother.
2020-09-06 23:16:38 +02:00
Florian Hahn
18bc223e14 [DSE,MemorySSA] Add a few additional debug messages.
2020-09-06 20:31:00 +01:00
Benjamin Kramer
c3337b0542 [SmallVector] Move error handling out of line
This reduces duplication and avoids emitting ice cold code into every
instance of grow().
2020-09-06 18:06:44 +02:00
Simon Pilgrim
ea26a20118 [X86][AVX] lowerShuffleWithPERMV - adjust binary shuffle masks to account for widening on non-VLX targets
rGabd33bf5eff2 enabled us to pad 128/256-bit shuffles to 512-bit on non-VLX targets, but wasn't updating binary shuffles to account for the new vector width.
2020-09-06 14:52:25 +01:00
Nikita Popov
587a99fa86 [InstSimplify] Fold degenerate abs of abs form
This addresses the remaining issue from D87188. Due to a series of
folds, we may end up with abs-of-abs represented as
x == 0 ? -abs(x) : abs(x). Rather than recognizing this as a special
abs pattern and doing an abs-of-abs fold on it afterwards,
I'm directly folding this to one of the select operands in InstSimplify.

The general pattern falls into the "select with operand replaced"
category, but that fold is not powerful enough to recognize that
both hands of the select are the same for value zero.

Differential Revision: https://reviews.llvm.org/D87197
2020-09-06 09:43:08 +02:00
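The claim that both arms of the select agree when x is zero can be checked exhaustively on a small bit width. The sketch below is a Python model over 8-bit signed integers with wrapping arithmetic; the helper names are made up for this example and it does not reflect how InstSimplify implements the fold.

```python
def wrap(value: int, bits: int = 8) -> int:
    """Wrap an integer to a signed two's-complement value of `bits` bits."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value >> (bits - 1) else value


def abs_wrapped(x: int, bits: int = 8) -> int:
    # abs with wrapping semantics: abs(INT_MIN) stays INT_MIN.
    return wrap(-x if x < 0 else x, bits)


def degenerate_select(x: int, bits: int = 8) -> int:
    # The pattern from the commit message: x == 0 ? -abs(x) : abs(x)
    if x == 0:
        return wrap(-abs_wrapped(x, bits), bits)
    return abs_wrapped(x, bits)


# For x == 0 both arms are 0, so the select always equals its false arm.
all_equal = all(
    degenerate_select(x) == abs_wrapped(x) for x in range(-128, 128)
)
```

Because the true arm is only taken for x == 0, where -abs(0) and abs(0) coincide, the whole select can be replaced by the abs(x) operand.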
Amara Emerson
eb6cc475b7 [GlobalISel] Disable the indexed loads combine completely unless forced. NFC.
The post-index matcher, before it queries the target legality, walks uses
of some instructions which in pathological cases can be massive. Since
no targets actually support indexed loads yet, disable this to stop wasting
compile time on something which is going to fail anyway.
2020-09-05 21:04:03 -07:00
vnalamot
62e3b4669e [AMDGPU] Remove the dead spill slots while spilling FP/BP to memory
During the PEI pass, the dead TargetStackID::SGPRSpill spill slots
are not being removed while spilling the FP/BP to memory.

Fixes: SWDEV-250393

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D87032
2020-09-06 07:04:25 +05:30
Krzysztof Parzyszek
cb324edefe [Hexagon] Add assertions about V6_pred_scalar2
2020-09-05 18:20:23 -05:00
Krzysztof Parzyszek
34a99d5a26 [Hexagon] When widening truncate result, also widen operand if necessary
2020-09-05 18:19:32 -05:00
Krzysztof Parzyszek
c8b94c00f4 [Hexagon] Resize the mem operand when widening loads and stores
2020-09-05 18:17:48 -05:00
Krzysztof Parzyszek
00ccf90843 [Hexagon] Handle widening of vector truncate
2020-09-05 15:07:38 -05:00
Florian Hahn
6276196b86 [LangRef] Adjust guarantee for llvm.memcpy to also allow equal arguments.
This adjusts the description of `llvm.memcpy` to also allow operands
to be equal. This is in line with what Clang currently expects.

This change is intended to be temporary and followed by re-introducing
a variant with the non-overlapping guarantee for cases where we can
actually ensure that property in the front-end.

See the links below for more details:
http://lists.llvm.org/pipermail/cfe-dev/2020-August/066614.html
and PR11763.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D86815
2020-09-05 19:18:23 +01:00
Nikita Popov
50631fa23a [SCEV] Recognize min/max intrinsics
Recognize umin/umax/smin/smax intrinsics and convert them to the
already existing SCEV nodes of the same name.

In the future we'll want SCEVExpander to also produce the intrinsics,
but we're not ready for that yet.

Differential Revision: https://reviews.llvm.org/D87160
2020-09-05 16:30:11 +02:00
Nikita Popov
8436150b08 [InstCombine] Fold abs with dominating condition
Similar to D87168, but for abs. If we have a dominating x >= 0
condition, then we know that abs(x) is x. This fold is in
InstCombine, because we need to create a sub instruction for
the x < 0 case.

Differential Revision: https://reviews.llvm.org/D87184
2020-09-05 16:18:35 +02:00
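The two cases of this fold can be modelled as below. This is a hedged Python sketch over 8-bit signed integers, not InstCombine code; `fold_abs` and its `known_non_negative` parameter are hypothetical and stand in for whichever dominating condition is known to hold.

```python
def wrap8(value: int) -> int:
    """Wrap an integer to a signed 8-bit two's-complement value."""
    value &= 0xFF
    return value - 0x100 if value & 0x80 else value


def abs8(x: int) -> int:
    # Reference abs with wrapping semantics.
    return wrap8(-x if x < 0 else x)


def fold_abs(x: int, known_non_negative: bool) -> int:
    # x >= 0 dominates: abs(x) is x itself.
    # x < 0 dominates: abs(x) is 0 - x, which needs a new sub instruction.
    return x if known_non_negative else wrap8(0 - x)
```

The negative case is why the fold creates an instruction rather than only picking an existing value.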
Nikita Popov
1eb2ecc5c6 [InstSimplify] Fold min/max based on dominating condition
If we have a dominating condition that x >= y, then umax(x, y) is x,
etc. I'm doing this in InstSimplify as the corresponding transform
for the select form is also done there.

Differential Revision: https://reviews.llvm.org/D87168
2020-09-05 16:16:40 +02:00
Nikita Popov
ecd979f4bb [InstCombine] Fold abs intrinsic eq zero
Following the same transform for the select version of abs.
2020-09-05 15:11:38 +02:00
Nikita Popov
fa01287458 [InstCombine] Fold mul of abs intrinsic
Same as the existing SPF_ABS fold. We don't need to explicitly
handle NABS, as the negs will get folded away first.
2020-09-05 12:37:45 +02:00
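Assuming the fold in question is abs(x) * abs(x) -> x * x, as with the select-based SPF_ABS pattern, its validity under wrapping arithmetic can be checked exhaustively for 8-bit values. The Python sketch below uses helper names invented for this example.

```python
def wrap8(value: int) -> int:
    """Wrap an integer to a signed 8-bit two's-complement value."""
    value &= 0xFF
    return value - 0x100 if value & 0x80 else value


def abs8(x: int) -> int:
    # abs with wrapping semantics: abs(-128) stays -128.
    return wrap8(-x if x < 0 else x)


def mul8(a: int, b: int) -> int:
    # 8-bit multiplication with wraparound.
    return wrap8(a * b)


# abs(x) * abs(x) and x * x agree for every 8-bit x, including -128.
mul_fold_holds = all(
    mul8(abs8(x), abs8(x)) == mul8(x, x) for x in range(-128, 128)
)
```

The same check passes for the negated form, since (-a) * (-a) equals a * a modulo 2^8.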
Nikita Popov
5750a482e4 [InstCombine] Fold cttz of abs intrinsic
Same as the existing fold for SPF_ABS. We don't need to explicitly
handle the NABS variant, as we'll first fold away the neg in that
case.
2020-09-05 12:25:41 +02:00
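Negation does not change the number of trailing zero bits, which is why cttz(abs(x)) can be replaced by cttz(x). The Python sketch below checks this for all 8-bit values; the helper names are assumptions for illustration only.

```python
def wrap8(value: int) -> int:
    """Wrap an integer to a signed 8-bit two's-complement value."""
    value &= 0xFF
    return value - 0x100 if value & 0x80 else value


def abs8(x: int) -> int:
    # abs with wrapping semantics: abs(-128) stays -128.
    return wrap8(-x if x < 0 else x)


def cttz8(value: int) -> int:
    """Count trailing zero bits of the 8-bit pattern of `value`."""
    pattern = value & 0xFF
    if pattern == 0:
        return 8
    # pattern & -pattern isolates the lowest set bit.
    return (pattern & -pattern).bit_length() - 1


# cttz(abs(x)) == cttz(x) for every 8-bit x.
cttz_fold_holds = all(cttz8(abs8(x)) == cttz8(x) for x in range(-128, 128))
```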
Jonas Paulsson
7897525197 [SelectionDAG] Always intersect SDNode flags during getNode() node memoization.
Previously SDNodeFlags::intersectWith(Flags) would do nothing if Flags was
in an undefined state, which is very bad given that this is the default when
getNode() is called without passing an explicit SDNodeFlags argument.

This meant that if an already existing and reused node had a flag which the
second caller to getNode() did not set, that flag would remain uncleared.

This was exposed by https://bugs.llvm.org/show_bug.cgi?id=47092, where an NSW
flag was incorrectly set on an add instruction (which did in fact overflow in
one of the two original contexts), so when SystemZElimCompare removed the
compare with 0 trusting that flag, wrong-code resulted.

There is more that needs to be done in this area as discussed here:

Differential Revision: https://reviews.llvm.org/D86871

Review: Ulrich Weigand, Sanjay Patel
2020-09-05 10:30:38 +02:00
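The effect of always intersecting flags on a memoized node can be shown with a toy cache. This Python sketch is not the SelectionDAG data structure: `NodeCache`, `get_node`, and the string flag names are hypothetical stand-ins for the behaviour described in this commit.

```python
class NodeCache:
    """Toy model of node memoization keyed by opcode and operands."""

    def __init__(self):
        self._nodes = {}

    def get_node(self, opcode, operands, flags=frozenset()):
        key = (opcode, tuple(operands))
        if key in self._nodes:
            # Reuse: keep only flags that every requester guarantees.
            # An empty default flag set therefore clears all flags.
            self._nodes[key] &= set(flags)
        else:
            self._nodes[key] = set(flags)
        return key, self._nodes[key]
```

In this model, a first request for `add a, b` with `nsw` followed by a second request without flags leaves the shared node with no flags, so a later pass cannot rely on a no-overflow guarantee that only one context provided.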
serge-sans-paille
32b636840a Fix return status of SimplifyCFG
When a switch case is folded into the default case, that is an IR change that
should be reported. Update ConstantFoldTerminator accordingly.

Differential Revision: https://reviews.llvm.org/D87142
2020-09-05 07:54:15 +02:00
Qiu Chaofan
6da3508c40 [PowerPC] Expand constrained ppc_fp128 to i32 conversion
Libcall __gcc_qtou is not available, which breaks some tests needing
it. On PowerPC, we have code to manually expand the operation, this
patch applies it to constrained conversion. To keep it strict-safe,
it's using the algorithm similar to expandFP_TO_UINT.

For constrained operations marking FP exception behavior as 'ignore',
we should set the NoFPExcept flag. However, in some custom lowering
the flag is missed. This should be fixed by future patches.

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D86605
2020-09-05 13:16:20 +08:00
Krzysztof Parzyszek
af4c0a6b56 [Hexagon] Unindent everything in HexagonISelLowering.h, NFC
Just a shift, no other formatting changes.
2020-09-04 17:25:29 -05:00
Craig Topper
45fc52d9c3 [X86] Prevent shuffle combining from creating an identical X86ISD::SHUF128.
This can cause an infinite loop if SimplifiedDemandedElts asks
for the node to replace itself.

A similar protection exists in other places in shuffle combining.

Fixes ISPC https://github.com/ispc/ispc/issues/1864
2020-09-04 14:12:49 -07:00