26821 Commits

Author SHA1 Message Date
Simon Pilgrim
c442198d91 [X86][Btver2] Fix BT(C|R|S)mr & BT(C|R|S)mi schedule latency + uop counts
Match AMD Fam16h SOG + llvm-exegesis tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343494 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-01 16:31:30 +00:00
Matthias Braun
c2625345a6 DAGCombiner: StoreMerging: Fix bad index calculating when adjusting mismatching vector types
This fixes a case of bad index calculation when merging mismatching
vector types. This changes the existing code to just use the existing
extract_{subvector|element} and a bitcast (instead of bitcast first and
then newly created extract_xxx) so we don't need to adjust any indices
in the first place.

rdar://44584718

Differential Revision: https://reviews.llvm.org/D52681

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343493 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-01 16:25:50 +00:00
Sanjay Patel
f8aaa5b2fb [x86] add tests for 256- and 512-bit vector types for scalar-to-vector transform; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343491 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-01 16:17:18 +00:00
Simon Atanasyan
b4f7a97945 [mips] Generate tests expectations using update_llc_test_checks. NFC
Generate tests expectations using update_llc_test_checks and reduce
number of "check prefixes" used in the tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343485 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-01 14:43:07 +00:00
Clement Courbet
d80543967d [X86][Sched] Update scheduling information for VZEROALL on HWS, BDW, SKX, SNB.
Summary:
    While looking at PR35606, I found out that the scheduling info is incorrect.

    One can check that it's really a P5+P6 and not a 2*P56 with:
    echo -e 'vzeroall\nvandps %xmm1, %xmm2, %xmm3' | ./bin/llvm-exegesis -mode=uops -snippets-file=-
    (vandps executes on P5 only)

    Reviewers: craig.topper, RKSimon

    Subscribers: llvm-commits

    Differential Revision: https://reviews.llvm.org/D52541

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343447 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-01 08:37:48 +00:00
Carlos Alberto Enciso
42b5443507 [DebugInfo][Dexter] Incorrect DBG_VALUE after MCP dead copy instruction removal.
When MachineCopyPropagation eliminates a dead 'copy', its associated debug information becomes invalid. as the recorded register has been removed.  It causes the debugger to display wrong variable value.

Differential Revision: https://reviews.llvm.org/D52614

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343445 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-01 08:14:44 +00:00
Clement Courbet
e7e54cd186 [CodeGen][NFC] Add tests for heterogeneous types in MergeConsecutiveStores
Reviewers: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52643

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343444 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-01 07:16:22 +00:00
Craig Topper
62c612fdec [X86] Stop X86DomainReassignment from creating copies between GR8/GR16 physical registers and k-registers.
We can only copy between a k-register and a GR32/GR64 register.

This patch detects that the copy will be illegal and prevents the domain reassignment from happening for that closure.

This probably isn't the best fix, and we should probably figure out how to handle this correctly.

Fixes PR38803.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343443 91177308-0d34-0410-b5e6-96231b3b80d8
2018-10-01 07:08:41 +00:00
Simon Pilgrim
abccef1dfd [X86] Fix scheduler class for BTmi instructions
This wasn't treated as a folded load instruction

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343424 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-30 20:19:16 +00:00
Bjorn Pettersson
cce9df6525 [PHIElimination] Lower a PHI node with only undef uses as IMPLICIT_DEF
Summary:
The lowering of PHI nodes used to detect if all inputs originated
from IMPLICIT_DEF's. If so the PHI node was replaced by an
IMPLICIT_DEF. Now we also consider undef uses when checking the
inputs. So if all inputs are implicitly defined or undef we
lower the PHI to an IMPLICIT_DEF. This makes
PHIElimination::LowerPHINode more consistent as it checks
both implicit and undef properties at later stages.

Reviewers: MatzeB, tstellar

Reviewed By: MatzeB

Subscribers: jvesely, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D52558

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343417 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-30 17:26:58 +00:00
Bjorn Pettersson
e299be6492 [PHIElimination] Update the regression test for PR16508
Summary:
When PR16508 was solved (in rL185363) a regression test was
added as test/CodeGen/PowerPC/2013-07-01-PHIElimBug.ll.
I discovered that the test case no longer reproduced the
scenario from PR16508. This problem could have been amended
by adding an extra RUN line with "-O1" (or possibly "-O0"),
but instead I added a mir-reproducer
  test/CodeGen/PowerPC/2013-07-01-PHIElimBug.mir
to get a reproducer that is less sensitive to changes in
earlier passes (including O-level).

While being at it I also corrected a code comment in
PHIElimination::EliminatePHINodes that has been incorrect
since the related bugfix from rL185363.

Reviewers: MatzeB, hfinkel

Reviewed By: MatzeB

Subscribers: nemanjai, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D52553

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343416 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-30 17:23:21 +00:00
Roman Lebedev
219704f8ee [NFC][CodeGen][X86][AArch64] Add 64-bit constant bit field extract pattern tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343404 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-30 12:42:08 +00:00
Simon Pilgrim
d09ede4828 [X86] Regenerate MMX coalescing test
Exposes another extractelement(bitcast(scalartovector())) pattern

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343403 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-30 09:42:04 +00:00
Craig Topper
ecbe2f8e5e [X86] Disable BMI BEXTR in X86DAGToDAGISel::matchBEXTRFromAnd unless we're on compiling for a CPU with single uop BEXTR
Summary:
This function turns (X >> C1) & C2 into a BMI BEXTR or TBM BEXTRI instruction. For BMI BEXTR we have to materialize an immediate into a register to feed to the BEXTR instruction.

The BMI BEXTR instruction is 2 uops on Intel CPUs. It looks like on SKL its one port 0/6 uop and one port 1/5 uop. Despite what Agner's tables say. I know one of the uops is a regular shift uop so it would have to go through the port 0/6 shifter unit. So that's the same or worse execution wise than the shift+and which is one 0/6 uop and one 0/1/5/6 uop. The move immediate into register is an additional 0/1/5/6 uop.

For now I've limited this transform to AMD CPUs which have a single uop BEXTR. If may also might make sense if we can fold a load or if the and immediate is larger than 32-bits and can't be encoded as a sign extended 32-bit value or if LICM or CSE can hoist the move immediate and share it. But we'd need to look more carefully at that. In the regression I looked at it doesn't look load folding or large immediates were occurring so the regression isn't caused by the loss of those. So we could try to be smarter here if we find a compelling case.

Reviewers: RKSimon, spatel, lebedev.ri, andreadb

Reviewed By: RKSimon

Subscribers: llvm-commits, andreadb, RKSimon

Differential Revision: https://reviews.llvm.org/D52570

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343399 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-30 03:01:46 +00:00
David Bolvansky
21b46b07d7 [DAGCombiner][NFC] Tests for X div/rem Y single bit fold
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343392 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-29 21:00:37 +00:00
Simon Pilgrim
ca85b1757c [X86][AVX2] Cleanup shuffle combining tests - add common prefixes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343391 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-29 20:34:16 +00:00
Simon Pilgrim
bbb291c0d3 [X86] SimplifyDemandedVectorEltsForTargetNode - remove identity target shuffles before simplifying inputs
By removing demanded target shuffles that simplify to zero/undef/identity before simplifying its inputs we improve chances of further simplification, as only the immediate parent user of the combined is added back to the work list - this still doesn't help us if its passed through other ops though (bitcasts....).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343390 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-29 18:15:26 +00:00
Craig Topper
59b80ed4d5 [X86] Add fast-isel test cases for unaligned load/store intrinsics recently added to clang
This adds tests for:
_mm_loadu_si16
_mm_loadu_si32
_mm_loadu_si16
_mm_storeu_si64
_mm_storeu_si32
_mm_storeu_si16

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343389 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-29 18:03:52 +00:00
Simon Pilgrim
bb664accc2 [X86] getTargetConstantBitsFromNode - add support for rearranging constant bits via shuffles
Exposed an issue that recursive calls to getTargetConstantBitsFromNode don't handle changes to EltSizeInBits yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343384 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-29 17:01:55 +00:00
Simon Pilgrim
db283fd1bb [X86] Regenerate fma comments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343376 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-29 14:31:00 +00:00
Simon Pilgrim
2a55d07e77 [X86] getTargetConstantBitsFromNode - add support for peeking through ISD::EXTRACT_SUBVECTOR
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343375 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-29 14:17:32 +00:00
Simon Pilgrim
a1ab644481 [X86][SSE] Fixed issue with v2i64 variable shifts on 32-bit targets
The shift amount might have peeked through a extract_subvector, altering the number of vector elements in the 'Amt' variable - so we were incorrectly calculating the ratio when peeking through bitcasts, resulting in incorrectly detecting splats.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343373 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-29 13:25:22 +00:00
Eli Friedman
b10248393f [ARM] Fix correctness checks in promoteToConstantPool.
Correctly check for relocations in the constant to promote. And don't
allow promoting a constant multiple times.

This partially fixes https://bugs.llvm.org//show_bug.cgi?id=32780 ;
it's not a complete fix because we also need to prevent
ARMConstantIslands from cloning the constant.

(-arm-promote-constant is currently off by default, and it stays off
with this patch. I'll look into turning it on again when all the known
issues are fixed.)

Differential Revision: https://reviews.llvm.org/D51472



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343361 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-28 20:27:31 +00:00
Eli Friedman
770516eb32 [ARM] Use preferred alignment for constants in promoteToConstantPool.
This mostly affects IR generated by non-clang frontends because clang
generally sets the alignment of globals explicitly.

Fixes https://bugs.llvm.org//show_bug.cgi?id=32394 .

(-arm-promote-constant is currently off by default, and it stays off
with this patch. I'll look into turning it on again when all the known
issues are fixed.)

Differential Revision: https://reviews.llvm.org/D51469



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343359 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-28 20:21:51 +00:00
Craig Topper
46f964d38a [X86] Add test cases for failures to use narrow test with immediate instructions when a truncate is beteen the CMP and the AND and the sign flag is used.
The code in X86ISelDAGToDAG only looks through truncates if the sign flag isn't used, but that is overly restrictive. A future patch will improve this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343355 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-28 19:06:28 +00:00
Evandro Menezes
6fbdafa0d5 [AArch64] Split zero cycle feature more granularly
Split the `zcz` feature into specific ones got GP and FP registers, `zcz-gp`
and `zcz-fp`, respectively, while retaining the original feature option to
mean both.

Differential revision: https://reviews.llvm.org/D52621

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343354 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-28 19:05:09 +00:00
Luke Cheeseman
4c6cb9f427 Revert r343317
- asan buildbots are breaking and I need to investigate the issue



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343341 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-28 17:01:50 +00:00
Aditya Nandakumar
88ba134609 [GISel]: Remove an incorrect assert in CallLowering
https://reviews.llvm.org/D51147

Asserting if any extend of vectors should be up to the target's
legalizer/target specific code not in CallLowering.

reviewed by : dsanders.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343325 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-28 15:08:49 +00:00
Luke Cheeseman
96f82d5c18 Reapply changes reverted by r343235
- Add fix so that all code paths that create DWARFContext
  with an ObjectFile initialise the target architecture in the context
- Add an assert that the Arch is known in the Dwarf CallFrameString method



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343317 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-28 13:37:27 +00:00
Petar Jovanovic
46304bd3ad [MIPS GlobalISel] Lower i64 arguments
Lower integer arguments larger then 32 bits for MIPS32.
setMostSignificantFirst is used in order for G_UNMERGE_VALUES and
G_MERGE_VALUES to always hold registers in same order, regardless of
endianness.

Patch by Petar Avramovic.

Differential Revision: https://reviews.llvm.org/D52409



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343315 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-28 13:28:47 +00:00
Jonas Devlieghere
edfa885345 Split invocations in CodeGen/X86/cpus.ll among multiple tests. (NFC)
On GreenDragon `CodeGen/X86/cpus.ll` is timing out on the bot with Asan
and UBSan enabled. With the same configuration on my machine, the test
passes but takes more than 3 minutes to do so. I could increase the
timeout, but I believe it makes more sense to split up the test because
it allows for more parallelism.

Differential revision: https://reviews.llvm.org/D52603

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343313 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-28 12:08:51 +00:00
Simon Pilgrim
ab5d6ab2de [X86][Btver2] Fix BSF/BSR schedule
Double throughput to account for 2 pipes + fix BSF's latency/uop counts

Match AMD Fam16h SOG + llvm-exegesis tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343311 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-28 10:26:48 +00:00
David Spickett
c2c1024556 [ARM] Allow execute only code on Cortex-m23
The NoMovt feature prevents the use of MOVW/MOVT
instructions on Cortex-M23 for performance reasons.
These instructions are required for execute only code
so NoMovt should be disabled when that option is enabled.

Differential Revision: https://reviews.llvm.org/D52551



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343302 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-28 08:55:19 +00:00
Simon Pilgrim
1076f420d2 [X86][BtVer2] Fix PHMINPOS schedule resources typo
PHMINPOS can run on either JFPU pipe

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343299 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-28 08:21:39 +00:00
Craig Topper
1b2b5a55db [X86] Add the test case from PR38986.
The assembly for this test should be optimal now after changes to the ScalarizeMaskedMemIntrin patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343281 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-27 23:25:10 +00:00
Craig Topper
2122d188d1 [ScalarizeMaskedMemIntrin] When expanding masked gathers, start with the passthru vector and insert the new load results into it.
Previously we started with undef and did a final merge with the passthru at the end.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343273 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-27 21:28:59 +00:00
Craig Topper
4f95400123 [ScalarizeMaskedMemIntrin] When expanding masked loads, start with the passthru value and insert each conditional load result over their element.
Previously we started with undef and did one final merge at the end with a select.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343271 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-27 21:28:52 +00:00
Stanislav Mekhanoshin
45657b2c3b [AMDGPU] Fold copy (copy vgpr)
This allows to reduce a number of used VGPRs in some cases.

Differential Revision: https://reviews.llvm.org/D52577

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343249 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-27 18:55:20 +00:00
Craig Topper
b41c4e18ff [ScalarizeMaskedMemIntrin] Don't emit 'icmp eq i1 %x, 1' to check mask values. That's just %x so use that directly.
Had we emitted this IR earlier, InstCombine would have removed icmp so I'm going to assume using the i1 directly would be considered canonical.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343244 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-27 18:01:48 +00:00
Luke Cheeseman
4d42c19cd1 Revert r343192 as an ubsan build is currently failing
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343235 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-27 16:47:30 +00:00
Daniel Cederman
930ab34cd5 [Sparc] Remove the support for builtin setjmp/longjmp
Summary: It is currently broken and for Sparc there is not much benefit
in using a builtin version compared to a library version. Both versions
needs to store the same four values in setjmp and flush the register
windows in longjmp. If the need for a builtin setjmp/longjmp arises there
is an improved implementation available at https://reviews.llvm.org/D50969.

Reviewers: jyknight, joerg, venkatra

Subscribers: fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D51487

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343210 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-27 13:32:54 +00:00
Luke Cheeseman
a26da93994 Reapply changes reverted in r343114, lldb patch to follow shortly
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343192 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-27 10:39:20 +00:00
Craig Topper
23ecc2ebf7 [X86] Update tzcnt fast-isel tests to match clang r343126.
We now generate cttz with the zero_undef flag set to false. This allows -O0 to avoid the zero check.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343127 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-26 17:19:28 +00:00
Luke Cheeseman
9f0b248da5 Revert r343112 as CallFrameString API change has broken lldb builds
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343114 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-26 14:48:03 +00:00
Luke Cheeseman
538d8c7d85 [AArch64] - Return address signing dwarf support
- Reapply r343089 with a fix for DebugInfo/Sparc/gnu-window-save.ll



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343112 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-26 14:30:29 +00:00
Francis Visoiu Mistrih
fa73cae344 [CodeGen] Always print register ties in MI::dump()
It was the case when calling MO::dump(), but MI::dump() was still
depending on hasComplexRegisterTies().

The MIR output is not affected.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343107 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-26 13:33:09 +00:00
Hans Wennborg
21491992cf Revert r343089 "[AArch64] - Return address signing dwarf support"
This caused the DebugInfo/Sparc/gnu-window-save.ll test to fail.

> Functions that have signed return addresses need additional dwarf support:
> - After signing the LR, and before authenticating it, the LR register is in a
>   state the is unusable by a debugger or unwinder
> - To account for this a new directive, .cfi_negate_ra_state, is added
> - This directive says the signed state of the LR register has now changed,
>   i.e. unsigned -> signed or signed -> unsigned
> - This directive has the same CFA code as the SPARC directive GNU_window_save
>   (0x2d), adding a macro to account for multiply defined codes
> - This patch matches the gcc implementation of this support:
>   https://patchwork.ozlabs.org/patch/800271/
>
> Differential Revision: https://reviews.llvm.org/D50136

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343103 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-26 12:57:45 +00:00
Hiroshi Inoue
54312c3d1b [PowerPC] optimize conditional branch on CRSET/CRUNSET
This patch adds a check to optimize conditional branch (BC and BCn) based on a constant set by CRSET or CRUNSET.
Other optimizers, such as block placement, may generate such code and hence
I do this at the very end of the optimization in pre-emit peephole pass.

A conditional branch based on a constant is eliminated or converted into unconditional branch. 
Also CRSET/CRUNSET is eliminated if the condition code register is not used
by instruction other than the branch to be optimized.

Differential Revision: https://reviews.llvm.org/D52345



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343100 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-26 12:32:45 +00:00
Simon Pilgrim
65bcc34568 [X86][SSE] Refresh PR34947 test code to handle D52504
The previously reduced version used urem <9 x i32> zeroinitializer, %tmp which D52504 will simplify.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343097 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-26 11:53:51 +00:00
Simon Pilgrim
88c138a88a [X86][SSE] Use ISD::MULHS for constant vXi16 ISD::SRA lowering (PR38151)
Similar to the existing ISD::SRL constant vector shifts from D49562, this patch adds ISD::SRA support with ISD::MULHS.

As we're dealing with signed values, we have to handle shift by zero and shift by one special cases, so XOP+AVX2/AVX512 splitting/extension is still a better solution - really we should still use ISD::MULHS if one of the special cases are used but for now I've just left a TODO and filtered by isKnownNeverZero.

Differential Revision: https://reviews.llvm.org/D52171

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343093 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-26 10:57:05 +00:00