26641 Commits

Author SHA1 Message Date
Sam Parker
6189e4972d [HardwareLoops] Loop counter guard intrinsic
Introduce llvm.test.set.loop.iterations which sets the loop counter
and also produces an i1 after testing that the count is not zero.

Differential Revision: https://reviews.llvm.org/D63809


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364628 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-28 07:38:16 +00:00
Matt Arsenault
cc1390c334 GlobalISel: Use Register
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364618 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-28 01:47:44 +00:00
Matt Arsenault
5564ced645 GlobalISel: Convert rest of MachineIRBuilder to using Register
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364615 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-28 01:16:41 +00:00
Amara Emerson
b9f4ed6752 [GlobalISel][IRTranslator] Fix some PHI bugs related to jump tables when optimizations are used.
The new switch lowering code that tries to generate jump tables and range checks
were tested at -O0 on arm64, but on -O3 the generic switch lowering code goes to
town on trying to generate optimized lowerings, e.g. multiple jump tables, range
checks etc. This exposed bugs in the way PHI nodes are handled because the CFG
looks even stranger after all of this is done.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364613 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-27 23:56:34 +00:00
Rumeet Dhindsa
cf66538c75 Fix ASAN error caused by commit r364512.
This patch intends to fix ASAN stack-use-after-scope error.
This is at least a short-term fix to unbreak LLVM's mainline.

Differential Revision: https://reviews.llvm.org/D63905

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364611 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-27 23:37:04 +00:00
Roman Lebedev
65afd3a0fd [CodeGen] [SelectionDAG] More efficient code for X % C == 0 (UREM case) (try 3)
Summary:
I'm submitting a new revision since i don't understand how to reclaim/reopen/take over the existing one, D50222.
There is no such action in "Add Action" menu...

This implements an optimization described in Hacker's Delight 10-17: when `C` is constant,
the result of `X % C == 0` can be computed more cheaply without actually calculating the remainder.
The motivation is discussed here: https://bugs.llvm.org/show_bug.cgi?id=35479.

This is a recommit, the original commit rL364563 was reverted in rL364568
because test-suite detected miscompile - the new comparison constant 'Q'
was being computed incorrectly (we divided by `D0` instead of `D`).

Original patch D50222 by @hermord (Dmytro Shynkevych)

Notes:
- In principle, it's possible to also handle the `X % C1 == C2` case, as discussed on bugzilla.
  This seems to require an extra branch on overflow, so I refrained from implementing this for now.
- An explicit check for when the `REM` can be reduced to just its LHS is included:
  the `X % C` == 0 optimization breaks `test1` in `test/CodeGen/X86/jump_sign.ll` otherwise.
  I hadn't managed to find a better way to not generate worse output in this case.
- The `test/CodeGen/X86/jump_sign.ll` regresses, and is being fixed by a followup patch D63390.

Reviewers: RKSimon, craig.topper, spatel, hermord, xbolva00

Reviewed By: RKSimon, xbolva00

Subscribers: dexonsmith, kristina, xbolva00, javed.absar, llvm-commits, hermord

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63391

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364600 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-27 21:52:10 +00:00
Djordje Todorovic
6998f5cb03 Revert "[LiveDebugValues] Emit the debug entry values"
Appears that the 'test/DebugInfo/MIR/X86/dbginfo-entryvals.mir'
does not pass on Windows.

This reverts commit rL364553.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364571 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-27 18:12:04 +00:00
Roman Lebedev
1df452bb5a Revert "[CodeGen] [SelectionDAG] More efficient code for X % C == 0 (UREM case) (try 2)"
*Appears* to break test-suite on
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/23790

FAIL: burg.execution_time
FAIL: spiff.execution_time
FAIL: employ.execution_time
FAIL: llu.execution_time
FAIL: gramschmidt.execution_time
FAIL: fdtd-apml.execution_time

This reverts commit r364563.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364568 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-27 17:22:31 +00:00
Roman Lebedev
84139109d4 [CodeGen] [SelectionDAG] More efficient code for X % C == 0 (UREM case) (try 2)
Summary:
I'm submitting a new revision since i don't understand how to reclaim/reopen/take over the existing one, D50222.
There is no such action in "Add Action" menu...
Original patch D50222 by @hermord (Dmytro Shynkevych)

This implements an optimization described in Hacker's Delight 10-17: when `C` is constant,
the result of `X % C == 0` can be computed more cheaply without actually calculating the remainder.
The motivation is discussed here: https://bugs.llvm.org/show_bug.cgi?id=35479.

Original patch author: @hermord (Dmytro Shynkevych)!

Notes:
- In principle, it's possible to also handle the `X % C1 == C2` case, as discussed on bugzilla.
  This seems to require an extra branch on overflow, so I refrained from implementing this for now.
- An explicit check for when the `REM` can be reduced to just its LHS is included:
  the `X % C` == 0 optimization breaks `test1` in `test/CodeGen/X86/jump_sign.ll` otherwise.
  I hadn't managed to find a better way to not generate worse output in this case.
- The `test/CodeGen/X86/jump_sign.ll` regresses, and is being fixed by a followup patch D63390.

Reviewers: RKSimon, craig.topper, spatel, hermord, xbolva00

Reviewed By: RKSimon, xbolva00

Subscribers: xbolva00, javed.absar, llvm-commits, hermord

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63391

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364563 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-27 16:45:42 +00:00
Djordje Todorovic
068d4d9024 [LiveDebugValues] Emit the debug entry values
Emit replacements for clobbered parameters location if the parameter
has unmodified value throughout the funciton. This is basic scenario
where we can use the debug entry values.

([12/13] Introduce the debug entry values.)

Co-authored-by: Ananth Sowda <asowda@cisco.com>
Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com>
Co-authored-by: Ivan Baev <ibaev@cisco.com>

Differential Revision: https://reviews.llvm.org/D58042

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364553 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-27 15:35:48 +00:00
Djordje Todorovic
7418f3f6fa [LiveRangeEdit] Fix build failure caused by the rL364536
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364549 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-27 14:31:52 +00:00
Simon Pilgrim
39f0e3cf18 [TargetLowering] SimplifyDemandedVectorElts - add shift/rotate support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364548 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-27 14:25:54 +00:00
Djordje Todorovic
192906bcfd [DWARF] Handle the DW_OP_entry_value operand
Add the IR and the AsmPrinter parts for handling of the DW_OP_entry_values
DWARF operation.

([11/13] Introduce the debug entry values.)

Co-authored-by: Ananth Sowda <asowda@cisco.com>
Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com>
Co-authored-by: Ivan Baev <ibaev@cisco.com>

Differential Revision: https://reviews.llvm.org/D60866

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364542 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-27 13:52:34 +00:00
Simon Pilgrim
c2a0046960 [TargetLowering] SimplifyDemandedBits - use DemandedElts to better identify partial splat shift amounts
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364541 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-27 13:48:43 +00:00
Djordje Todorovic
856ec6e1b0 [Backend] Keep call site info valid through the backend
Handle call instruction replacements and deletions in order to preserve
valid state of the call site info of the MachineFunction.

NOTE: If the call site info is enabled for a new target, the assertion from
the MachineFunction::DeleteMachineInstr() should help to locate places
where the updateCallSiteInfo() should be called in order to preserve valid
state of the call site info.

([10/13] Introduce the debug entry values.)

Co-authored-by: Ananth Sowda <asowda@cisco.com>
Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com>
Co-authored-by: Ivan Baev <ibaev@cisco.com>

Differential Revision: https://reviews.llvm.org/D61062

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364536 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-27 13:10:29 +00:00
Djordje Todorovic
e02e5c6f21 [ISEL][X86] Tracking of registers that forward call arguments
While lowering calls, collect info about registers that forward arguments
into following function frame. We store such info into the MachineFunction
of the call. This is used very late when dumping DWARF info about
call site parameters.

([9/13] Introduce the debug entry values.)

Co-authored-by: Ananth Sowda <asowda@cisco.com>
Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com>
Co-authored-by: Ivan Baev <ibaev@cisco.com>

Differential Revision: https://reviews.llvm.org/D60715

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364516 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-27 10:51:15 +00:00
Jeremy Morse
4a65564b26 [DebugInfo] Avoid register coalesing unsoundly changing DBG_VALUE locations
Once MIR code leaves SSA form and the liveness of a vreg is considered,
DBG_VALUE insts are able to refer to non-live vregs, because their
debug-uses do not contribute to liveness. This non-liveness becomes
problematic for optimizations like register coalescing, as they can't
``see'' the debug uses in the liveness analyses.

As a result registers get coalesced regardless of debug uses, and that can
lead to invalid variable locations containing unexpected values. In the
added test case, the first vreg operand of ADD32rr is merged with various
copies of the vreg (great for performance), but a DBG_VALUE of the
unmodified operand is blindly updated to the modified operand. This changes
what value the variable will appear to have in a debugger.

Fix this by changing any DBG_VALUE whose operand will be resurrected by
register coalescing to be a $noreg DBG_VALUE, i.e. give the variable no
location. This is an overapproximation as some coalesced locations are
safe (others are not) -- an extra domination analysis would be required to
work out which, and it would be better if we just don't generate non-live
DBG_VALUEs.

This fixes PR40010.

Differential Revision: https://reviews.llvm.org/D56151


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364515 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-27 10:20:27 +00:00
Diana Picus
419fa631cd [GlobalISel] Remove [un]packRegs from IRTranslator
Remove the last use of packRegs from IRTranslator and delete
pack/unpackRegs. This introduces a fallback to DAGISel for intrinsics
with aggregate arguments, since we don't have a testcase for them so
it's hard to tell how we'd want to handle them.

Discussed in https://reviews.llvm.org/D63551

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364514 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-27 09:49:07 +00:00
Diana Picus
65c682330e [GlobalISel] Accept multiple vregs for lowerCall's args
Change the interface of CallLowering::lowerCall to accept several
virtual registers for each argument, instead of just one.  This is a
follow-up to D46018.

CallLowering::lowerReturn was similarly refactored in D49660 and
lowerFormalArguments in D63549.

With this change, we no longer pack the virtual registers generated for
aggregates into one big lump before delegating to the target. Therefore,
the target can decide itself whether it wants to handle them as separate
pieces or use one big register.

ARM and AArch64 have been updated to use the passed in virtual registers
directly, which means we no longer need to generate so many
merge/extract instructions.

NFCI for AMDGPU, Mips and X86.

Differential Revision: https://reviews.llvm.org/D63551

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364512 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-27 09:18:03 +00:00
Diana Picus
d3b26382a9 [GlobalISel] Accept multiple vregs for lowerCall's result
Change the interface of CallLowering::lowerCall to accept several
virtual registers for the call result, instead of just one.  This is a
follow-up to D46018.

CallLowering::lowerReturn was similarly refactored in D49660 and
lowerFormalArguments in D63549.

With this change, we no longer pack the virtual registers generated for
aggregates into one big lump before delegating to the target. Therefore,
the target can decide itself whether it wants to handle them as separate
pieces or use one big register.

ARM and AArch64 have been updated to use the passed in virtual registers
directly, which means we no longer need to generate so many
merge/extract instructions.

NFCI for AMDGPU, Mips and X86.

Differential Revision: https://reviews.llvm.org/D63550

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364511 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-27 09:15:53 +00:00
Diana Picus
4776c1ff97 [GlobalISel] Accept multiple vregs in lowerFormalArgs
Change the interface of CallLowering::lowerFormalArguments to accept
several virtual registers for each formal argument, instead of just one.
This is a follow-up to D46018.

CallLowering::lowerReturn was similarly refactored in D49660. lowerCall
will be refactored in the same way in follow-up patches.

With this change, we forward the virtual registers generated for
aggregates to CallLowering. Therefore, the target can decide itself
whether it wants to handle them as separate pieces or use one big
register. We also copy the pack/unpackRegs helpers to CallLowering to
facilitate this.

ARM and AArch64 have been updated to use the passed in virtual registers
directly, which means we no longer need to generate so many
merge/extract instructions.

AArch64 seems to have had a bug when lowering e.g. [1 x i8*], which was
put into a s64 instead of a p0. Added a test-case which illustrates the
problem more clearly (it crashes without this patch) and fixed the
existing test-case to expect p0.

AMDGPU has been updated to unpack into the virtual registers for
kernels. I think the other code paths fall back for aggregates, so this
should be NFC.

Mips doesn't support aggregates yet, so it's also NFC.

x86 seems to have code for dealing with aggregates, but I couldn't find
the tests for it, so I just added a fallback to DAGISel if we get more
than one virtual register for an argument.

Differential Revision: https://reviews.llvm.org/D63549

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364510 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-27 08:54:17 +00:00
Diana Picus
43a431faf5 [GlobalISel] Allow multiple VRegs in ArgInfo. NFC
Allow CallLowering::ArgInfo to contain more than one virtual register.
This is useful when passes split aggregates into several virtual
registers, but need to also provide information about the original type
to the call lowering. Used in follow-up patches.

Differential Revision: https://reviews.llvm.org/D63548

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364509 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-27 08:50:53 +00:00
Djordje Todorovic
3aa859711c [MachineFunction] Base support for call site info tracking
Add an attribute into the MachineFunction that tracks call site info.

([8/13] Introduce the debug entry values.)

Co-authored-by: Ananth Sowda <asowda@cisco.com>
Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com>
Co-authored-by: Ivan Baev <ibaev@cisco.com>

Differential Revision: https://reviews.llvm.org/D61061

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364506 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-27 07:48:06 +00:00
Djordje Todorovic
0bbea12560 [IR] Add DISuprogram and DIE for a func decl
A unique DISubprogram may be attached to a function declaration used for
call site debug info.

([6/13] Introduce the debug entry values.)

Co-authored-by: Ananth Sowda <asowda@cisco.com>
Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com>
Co-authored-by: Ivan Baev <ibaev@cisco.com>

Differential Revision: https://reviews.llvm.org/D60713

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364500 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-27 06:07:41 +00:00
Matt Arsenault
5d11007246 PEI: Add default handling of spills to registers
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364472 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-26 20:56:15 +00:00
Evandro Menezes
6016bbfa46 [CodeGen] Improve formatting of jump tables (NFC)
Split jump tables into individual lines and fix spacing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364436 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-26 15:11:31 +00:00
Roman Lebedev
a698111256 [X86] X86DAGToDAGISel::matchBitExtract(): pattern b: truncation awareness
Summary:
(Not so) boringly identical to pattern a (D62786)
Not yet sure how do deal with the last pattern c.

Reviewers: RKSimon, craig.topper, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62793

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364418 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-26 12:19:39 +00:00
Clement Courbet
6f6d98e186 Revert "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline."
Breaks sanitizers:
    libFuzzer :: cxxstring.test
    libFuzzer :: memcmp.test
    libFuzzer :: recommended-dictionary.test
    libFuzzer :: strcmp.test
    libFuzzer :: value-profile-mem.test
    libFuzzer :: value-profile-strcmp.test

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364416 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-26 12:13:13 +00:00
Chen Zheng
3a6c5d72b9 [HardwareLoops] NFC - move loop with irreducible control flow checking logic to HarewareLoopInfo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364415 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-26 12:02:43 +00:00
Clement Courbet
e0fc543f4c [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline.
This allows later passes (in particular InstCombine) to optimize more
cases.

One that's important to us is `memcmp(p, q, constant) < 0` and memcmp(p, q, constant) > 0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364412 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-26 11:50:18 +00:00
Simon Pilgrim
eb39630694 [DAGCombine] visitEXTRACT_SUBVECTOR - add TODO for extract_subvector(bitcast()) support
We support 'big to little' (e.g. extract_subvector(v16i8 bitcast(v2i64))) but not 'little to big' cases  (e.g. extract_subvector(v2i64 bitcast(v16i8)))


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364405 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-26 11:17:38 +00:00
Chen Zheng
5bd39d6f6e [HardwareLoops] NFC - move loop with irreducible control flow checking logic to isHardwareLoopProfitable()
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364397 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-26 09:12:52 +00:00
QingShan Zhang
2d7ed2481d Teach the DAGCombine to fold this pattern(c1 and c2 is constant).
// fold (sext (select cond, c1, c2)) -> (select cond, sext c1, sext c2)
// fold (zext (select cond, c1, c2)) -> (select cond, zext c1, zext c2)
// fold (aext (select cond, c1, c2)) -> (select cond, sext c1, sext c2)
Sign extend the operands if it is any_extend, to keep the signess of the operands that, the other combine rule would apply. The any_extend is handled as zero extend for constants. i.e.

t1: i8 = select t0, Constant:i8<-1>, Constant:i8<0>
t2: i64 = any_extend t1
 -->
t3: i64 = select t0, Constant:i64<-1>, Constant:i64<0>
 -->
t4: i64 = sign_extend_inreg t3

Differential Revision: https://reviews.llvm.org/D63318



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364382 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-26 05:12:53 +00:00
Jinsong Ji
4d1e6104e6 [MachinePipeliner] Fix risky iterator usage R++, --R
When we calculate MII, we use two loops, one with iterator R++ to
check whether we can reserve the resource, then --R to move back
the iterator to do reservation.

This is risky, as R++, --R may not point to the same element at all.
The can cause wrong MII.

Differential Revision: https://reviews.llvm.org/D63536

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364353 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-25 21:50:56 +00:00
Philip Reames
7e88372346 [Peephole] Allow folding loads into instructions w/multiple uses (such as test64rr)
Peephole opt has a one use limitation which appears to be accidental. The function being used was incorrectly documented as returning whether the def had one *user*, but instead returned true only when there was one *use*. Add a corresponding hasOneNonDbgUser helper, and adjust peephole-opt to use the appropriate one.

All of the actual folding code handles multiple uses within a single instruction. That codepath is well exercised through instruction selection.

Differential Revision: https://reviews.llvm.org/D63656



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364336 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-25 17:29:18 +00:00
Simon Pilgrim
d3047099fd [DAGCombine] combineRepeatedFPDivisors - recognize -1.0 / X as a reciprocal
Fixes issue identified by @nemanjai (Nemanja Ivanovic) in D62963 / rL363040 - infinite loop due to GetNegatedExpression fighting combineRepeatedFPDivisors resulting in fneg(fdiv(x,splat)) -> fneg(fmul(x,1.0/splat)) -> fmul(x,-1.0/splat) -> fmul(x,(-1.0 * 1.0)/splat) ......

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364326 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-25 16:00:16 +00:00
Sanjay Patel
4ad4edc530 [SDAG] expand ctpop != 1
Change the generic ctpop expansion to more efficiently handle a
check for not-a-power-of-two value:
(ctpop x) != 1 --> (x == 0) || ((x & x-1) != 0)

This is the inverted predicate sibling pattern that was added with:
D63004

This should have been done before I changed IR canonicalization to
favor this form with:
rL364246
...so if this requires revert/changing, the earlier commit may also
need to modified.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364319 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-25 14:46:52 +00:00
Simon Pilgrim
836545be6c [TargetLowering] SimplifyDemandedBits - add ANY_EXTEND_VECTOR_INREG support
Add 'lowest' demanded elt -> bitcast fold to all *_EXTEND_VECTOR_INREG cases.

Reapplies rL363856.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364311 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-25 13:25:57 +00:00
Simon Pilgrim
b266d9ca67 [TargetLowering] SimplifyDemandedBits ZERO_EXTEND_VECTOR_INREG -> ANY_EXTEND_VECTOR_INREG
Simplify ZERO_EXTEND_VECTOR_INREG if the extended bits are not required.

Matches what we already do for ZERO_EXTEND.

Reapplies rL363850 but now with legality checks added at rL364290

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364303 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-25 12:57:43 +00:00
Sanjay Patel
324d8e11c7 [SDAG] improve expansion of ctpop+setcc
This should not cause any visible change in output, but it's
more efficient because we were producing non-canonical 'sub x, 1'
and 'setcc ugt x, 0'. As mentioned in the TODO, we should also
be handling the inverse predicate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364302 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-25 12:49:35 +00:00
Simon Pilgrim
07e7622e7c [TargetLowering] SimplifyDemandedBits SIGN_EXTEND_VECTOR_INREG -> ANY/ZERO_EXTEND_VECTOR_INREG
Simplify SIGN_EXTEND_VECTOR_INREG if the extended bits are not required/known zero.

Matches what we already do for SIGN_EXTEND.

Reapplies rL363802 but now with legality checks added at rL364290

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364299 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-25 12:19:12 +00:00
Simon Pilgrim
6fb3129d4f [VectorLegalizer] ExpandANY_EXTEND_VECTOR_INREG/ExpandZERO_EXTEND_VECTOR_INREG - widen source vector
The *_EXTEND_VECTOR_INREG opcodes were relaxed back around rL346784 to support source vector widths that are smaller than the output - it looks like the legalizers were never updated to account for this.

This patch inserts the smaller source vector into an undef vector of the same width of the result before performing the shuffle+bitcast to correctly handle this.

Part of the yak shaving to solve the crashes from rL364264 and rL364272

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364295 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-25 11:31:37 +00:00
Simon Pilgrim
691701d2f8 [TargetLowering] SimplifyDemandedBits - legal checks for SIGN/ZERO_EXTEND -> ZERO/ANY_EXTEND
As part of the fix for rL364264 + rL364272 - limit the *_EXTEND conversion to !TLO.LegalOperations || isOperationLegal cases.

We'll improve X86 legality in future commits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364290 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-25 10:51:15 +00:00
Roman Lebedev
ded5cadad1 [Codegen] TargetLowering::SimplifySetCC(): omit urem when possible
Summary:
This addresses the regression that is being exposed by D50222 in `test/CodeGen/X86/jump_sign.ll`
The missing fold, at least partially, looks trivial:
https://rise4fun.com/Alive/Zsln
i.e. if we are comparing with zero, and comparing the `urem`-by-non-power-of-two,
and the `urem` is of something that may at most have a single bit set (or no bits set at all),
the `urem` is not needed.

Reviewers: RKSimon, craig.topper, xbolva00, spatel

Reviewed By: xbolva00, spatel

Subscribers: xbolva00, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63390

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364286 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-25 10:01:42 +00:00
Clement Courbet
6ef46a770d [ExpandMemCmp] Move all options to TargetTransformInfo.
Split off from D60318.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364281 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-25 08:04:13 +00:00
Craig Topper
a488f0fa72 Revert r363802, r363850, and r363856 "[TargetLowering] SimplifyDemandedBits..."
This reverts the following patches.
"[TargetLowering] SimplifyDemandedBits SIGN_EXTEND_VECTOR_INREG -> ANY/ZERO_EXTEND_VECTOR_INREG"
"[TargetLowering] SimplifyDemandedBits ZERO_EXTEND_VECTOR_INREG -> ANY_EXTEND_VECTOR_INREG"
"[TargetLowering] SimplifyDemandedBits - add ANY_EXTEND_VECTOR_INREG support"

We can end up with an any_extend_vector_inreg with a 256 bit result type
and a 128 bit result type. This is allowed by the ISD opcode, but the
generic operation legalizer is only able to expand cases where the
total vector width is the same.

The X86 backend creates these mismatched cases for zext_vec_inreg/sext_vec_inreg.
The SimplifyDemandedBits changes are allowing those nodes to become
aext_vec_inreg. For the zext/sext cases, the X86 backend has Custom
handling and never lets them get to the generic legalizer. We need to do the same
for aext_vec_inreg.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364264 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-25 01:32:42 +00:00
Roland Froese
12d2dd1fe1 [CodeGen] Add missing vector type legalization for ctlz_zero_undef
Widen vector result type for ctlz_zero_undef and cttz_zero_undef the same as
ctlz and cttz.

Differential Revision: https://reviews.llvm.org/D63463






git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364221 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-24 19:27:07 +00:00
Matt Arsenault
a3af6bb71d GlobalISel: Remove unsigned variant of SrcOp
Force using Register.

One downside is the generated register enums require explicit
conversion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364194 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-24 16:16:12 +00:00
Matt Arsenault
a2b05bc24d CodeGen: Introduce a class for registers
Avoids using a plain unsigned for registers throughoug codegen.
Doesn't attempt to change every register use, just something a little
more than the set needed to build after changing the return type of
MachineOperand::getReg().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364191 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-24 15:50:29 +00:00
Simon Pilgrim
0d7db2b064 [DAGCombine] visitMUL - allow shift by zero in MulByConstant.
This can occur under certain circumstances when undefs are created later on in the constant multipliers (e.g. in this case due to SimplifyDemandedVectorElts). Its better to let the shift by zero to occur and perform any cleanup afterward.

Fixes OSS Fuzz #15429

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364179 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-24 12:47:17 +00:00