Commit Graph

21890 Commits

Author SHA1 Message Date
Rafael Espindola
9aafb854cc Delete Default and JITDefault code models
IMHO it is an antipattern to have a enum value that is Default.

At any given piece of code it is not clear if we have to handle
Default or if has already been mapped to a concrete value. In this
case in particular, only the target can do the mapping and it is nice
to make sure it is always done.

This deletes the two default enum values of CodeModel and uses an
explicit Optional<CodeModel> when it is possible that it is
unspecified.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309911 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-03 02:16:21 +00:00
Tom Stellard
58dd3a3775 AMDGPU/GlobalISel: Mark 32-bit G_FMUL as legal
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D36218

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309898 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-02 22:56:30 +00:00
Stefan Pintilie
07635d3971 [Power9] Exploit vector absolute difference instructions on Power 9
Power 9 has instructions to do absolute difference (VABSDUB, VABSDUH, VABSDUW)
for byte, halfword and word. We should take advantage of these.

Differential Revision: https://reviews.llvm.org/D34684

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309876 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-02 20:07:21 +00:00
Evandro Menezes
7fee9f87f4 [AArch64] Add Exynos M2 feature test (NFC)
Test fusion of AES operations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309855 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-02 18:55:34 +00:00
Evandro Menezes
749993640b [AArch64] Improve the test of conditional branch fusion
Separate the checking of the fused pairings with B.cc and CBcc.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309825 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-02 15:34:06 +00:00
Diana Picus
bb326e2526 [MIR] Print target-specific constant pools
This should enable us to test the generation of target-specific constant
pools, e.g. for ARM:

constants:
 - id:              0
   value:           'g(GOT_PREL)-(LPC0+8-.)'
   alignment:       4
   isTargetSpecific: true

I intend to use this to test PIC support in GlobalISel for ARM.

This is difficult to test outside of that context, since the existing
MIR tests usually rely on parser support as well, and that seems a bit
trickier to add. We could try to add a unit test, but the setup for that
seems rather convoluted and overkill.

We do test however that the parser reports a nice error when
encountering a target-specific constant pool.

Differential Revision: https://reviews.llvm.org/D36092

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309806 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-02 11:09:30 +00:00
Matt Arsenault
6023e68dae AMDGPU: Fix clobbering CSR VGPRs when spilling SGPR to it
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309783 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-02 01:52:45 +00:00
Matt Arsenault
5474c6d8df AMDGPU: Fix emitting encoded calls
This was failing on out of bounds access to the extra operands
on the s_swappc_b64 beyond those in the instruction definition.

This was working, but somehow regressed within the past few weeks,
although I don't see any obvious commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309782 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-02 01:42:04 +00:00
Matt Arsenault
23e59ddf6d AMDGPU: Analyze callee resource usage in AsmPrinter
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309781 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-02 01:31:28 +00:00
Matt Arsenault
01ddaaf508 AMDGPU: Don't place arguments in emergency stack slot
When finding the fixed offsets for function arguments,
this needs to skip over the 4 bytes reserved for the
emergency stack slot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309776 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-02 00:59:51 +00:00
Matt Arsenault
6e8db40d65 DAG: Undo and->or combine with FrameIndexes
This pattern shows up when lowering byval copies on AMDGPU.

The byval object access is split into 4-byte chunks, adding a
constant offset to the FixedStack base. When some of the offsets
turn into ors, this prevents combining the constant offsets.

This makes it not apparent that the object is there when matching
addressing modes, so it ends up using a scratch wave offset
relative access and the lengthy frame index expansion for that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309775 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-02 00:43:42 +00:00
Matthias Braun
3a6f6d93bf X86: Do not use llc -march in tests.
`llc -march` is problematic because it only switches the target
architecture, but leaves the operating system unchanged. This
occasionally leads to indeterministic tests because the OS from
LLVM_DEFAULT_TARGET_TRIPLE is used.

However we can simply always use `llc -mtriple` instead. This changes
all the tests to do this to avoid people using -march when they copy and
paste parts of tests.

See also the discussion in https://reviews.llvm.org/D35287

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309774 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-02 00:28:10 +00:00
Stanislav Mekhanoshin
3499304a9a [AMDGPU] Turn s_and_saveexec_b64 into s_and_b64 if result is unused
With SI_END_CF elimination for some nested control flow we can now
eliminate saved exec register completely by turning a saveexec version
of instruction into just a logical instruction.

Differential Revision: https://reviews.llvm.org/D36007

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309766 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 23:44:35 +00:00
Stanislav Mekhanoshin
5b53ac928d [AMDGPU] Collapse adjacent SI_END_CF
Add a pass to remove redundant S_OR_B64 instructions enabling lanes in
the exec. If two SI_END_CF (lowered as S_OR_B64) come together without any
vector instructions between them we can only keep outer SI_END_CF, given
that CFG is structured and exec bits of the outer end statement are always
not less than exec bit of the inner one.

This needs to be done before the RA to eliminate saved exec bits registers
but after register coalescer to have no vector registers copies in between
of different end cf statements.

Differential Revision: https://reviews.llvm.org/D35967

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309762 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 23:14:32 +00:00
Matthias Braun
fe7581c1d1 ARM: Do not use llc -march in tests.
`llc -march` is problematic because it only switches the target
architecture, but leaves the operating system unchanged. This
occasionally leads to indeterministic tests because the OS from
LLVM_DEFAULT_TARGET_TRIPLE is used.

However we can simply always use `llc -mtriple` instead. This changes
all the tests to do this to avoid people using -march when they copy and
paste parts of tests.

See also the discussion in https://reviews.llvm.org/D35287

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309755 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 22:20:49 +00:00
Matthias Braun
b0a0439255 PowerPC: Do not use llc -march in tests.
`llc -march` is problematic because it only switches the target
architecture, but leaves the operating system unchanged. This
occasionally leads to indeterministic tests because the OS from
LLVM_DEFAULT_TARGET_TRIPLE is used.

However we can simply always use `llc -mtriple` instead. This changes
all the tests to do this to avoid people using -march when they copy and
paste parts of tests.

This patch:
- Removes -march if the .ll file already has a matching `target triple`
  directive or -mtriple argument.
- In all other cases changes -march=ppc32/-march=ppc64 to
  -mtriple=ppc32--/-mtriple=ppc64--

See also the discussion in https://reviews.llvm.org/D35287

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309754 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 22:20:41 +00:00
Adrian Prantl
2cd77a8486 Remove PrologEpilogInserter's usage of DBG_VALUE's offset field
In the last half-dozen commits to LLVM I removed code that became dead
after removing the offset parameter from llvm.dbg.value gradually
proceeding from IR towards the backend. Before I can move on to
DwarfDebug and friends there is one last side-called offset I need to
remove:  This patch modifies PrologEpilogInserter's use of the
DBG_VALUE's offset argument to use a DIExpression instead. Because the
PrologEpilogInserter runs at the Machine level I had to play a little
trick with a named llvm.dbg.mir node to get the DIExpressions to print
in MIR dumps (which print the llvm::Module followed by the
MachineFunction dump).

I also had to add rudimentary DwarfExpression support to CodeView and
as a side-effect also fixed a bug (CodeViewDebug::collectVariableInfo
was supposed to give up on variables with complex DIExpressions, but
would fail to do so for fragments, which are also modeled as
DIExpressions).

With this last holdover removed we will have only one canonical way of
representing offsets to debug locations which will simplify the code
in DwarfDebug (and future versions of CodeViewDebug once it starts
handling more complex expressions) and make it easier to reason about.

This patch is NFC-ish: All test case changes are for assembler
comments and the binary output does not change.

rdar://problem/33580047
Differential Revision: https://reviews.llvm.org/D36125

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309751 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 21:45:24 +00:00
Martin Storsjo
39bec211f0 [AArch64] Rewrite stack frame handling for win64 vararg functions
The previous attempt, which made do with a single offset in
computeCalleeSaveRegisterPairs, wasn't quite enough. The previous
attempt only worked as long as CombineSPBump == true (since the
offset would be adjusted later in fixupCalleeSaveRestoreStackOffset).

Instead include the size for the fixed stack area used for win64
varargs in calculations in emitPrologue/emitEpilogue. The stack
consists of mainly three parts;
- AFI->getLocalStackSize()
- AFI->getCalleeSavedStackSize()
- FixedObject

Most of the places in the code which previously used the CSStackSize
now use PrologueSaveSize instead, which is the sum of the latter
two, while some cases which need exactly the middle one use
AFI->getCalleeSavedStackSize() explicitly instead of a local variable.

In addition to moving the offsetting into emitPrologue/emitEpilogue
(which fixes functions with CombineSPBump == false), also set the
frame pointer to point to the right location, where the frame pointer
and link register actually are stored. In addition to the prologue/epilogue,
this also requires changes to resolveFrameIndexReference.

Add tests for a function that keeps a frame pointer and another one
that uses a VLA.

Differential Revision: https://reviews.llvm.org/D35919

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309744 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 21:13:54 +00:00
Matt Arsenault
f9a65f9c7e AMDGPU: Fix handling of div_scale with undef inputs
The src0 register must match src1 or src2, but if these
were undefined they could end up using different implicit_defed
virtual registers. Force these to use one undef vreg or pick the
defined other register.

Also fixes producing invalid nodes without the right number of
inputs when src2 is undef.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309743 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 20:49:41 +00:00
Matt Arsenault
ff9e21161d AMDGPU: Add test for r308774
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309733 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 19:54:58 +00:00
Matt Arsenault
43950949ad AMDGPU: Initial implementation of calls
Includes a hack to fix the type selected for
the GlobalAddress of the function, which will be
fixed by changing the default datalayout to use
generic pointers for 0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309732 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 19:54:18 +00:00
Simon Pilgrim
7e482e3a79 [X86][SSE3] Add scheduler tests for MONITOR/MWAIT
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309718 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 18:16:44 +00:00
Nirav Dave
8790231fa6 Revert "[DAG] Extend visitSCALAR_TO_VECTOR optimization to truncated vector."
This reverts commit r309680 which appears to be raising an assertion
in the test-suite.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309717 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 18:09:25 +00:00
Simon Pilgrim
8ae6d28100 [X86][SSE] Added missing vector logic intrinsic schedules
Improves atom scheduler test coverage (to make it easier to upgrade them for PR32431).

Merged SSE_VEC_BIT_ITINS_P + SSE_BIT_ITINS_P as we were interchanging between them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309715 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 17:51:20 +00:00
Sanjay Patel
5250bac12f [CGP] use narrower types in memcmp expansion when possible
This only affects very small memcmp on x86 for now, but it
will become more important if we allow vector-sized load and
compares.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309711 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 17:24:54 +00:00
Craig Topper
790133a052 [X86] Use BEXTR/BEXTRI for 64-bit 'and' with a large mask
Summary: The 64-bit 'and' with immediate instruction only supports a 32-bit immediate. So for larger constants we have to load the constant into a register first. If the immediate happens to be a mask we can use the BEXTRI instruction to perform the masking. We already do something similar using the BZHI instruction from the BMI2 instruction set.

Reviewers: RKSimon, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36129

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309706 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 17:18:14 +00:00
Simon Pilgrim
42dea1205c [X86][SSE] Added missing PACKSS/PACKUS intrinsic schedules
Improves atom scheduler test coverage (to make it easier to upgrade them for PR32431).

Checked on Agner that these actually match the UNPACK schedules, but better to include a separate class

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309701 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 16:47:48 +00:00
Craig Topper
4e3a697007 [X86] Split bmi.ll into a bmi test and a bmi2 test.
This moves all the bmi2 specific intrinsics to a separate test file and adds a bmi1 only command line to the existing bmi test.

This will allow us to see the missed opportunity to use bextr to handle 64-bit 'and' with a large mask. This will be improved in an upcoming patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309700 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 16:45:11 +00:00
Simon Pilgrim
bba95e61a4 [X86][SSSE3] Added missing PHADDS/PHSUBS/PSIGN intrinsic schedules
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309699 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 16:18:25 +00:00
Manoj Gupta
f6fecfacea [X86] Fix a crash in FEntryInserter Pass.
Summary:
FEntryInserter pass unconditionally derefs the first Instruction
in the first Basic Block. The pass crashes when the first
BasicBlock is empty. Fix the crash by not dereferencing the basic
Block iterator. This fixes an issue observed when building Linux kernel
4.4 with clang.

Fixes PR33971.

Reviewers: hfinkel, niravd, dblaikie

Reviewed By: niravd

Subscribers: davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D35979

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309694 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 15:39:12 +00:00
Craig Topper
2a519cbde9 [AVX-512] Don't use unmasked VMOVDQU8/16 for 8-bit or 16-bit element stores even when BWI instructions are supported. Always use VMOVDQA32/VMOVDQU32.
We were already using the 32 bit element opcode if BWI isn't enabled, but there's no reason to change opcode if we have BWI. We will still use the 8/16 opcodes for masked stores though.

This allows us to use the aligned opcode when we can which makes our test output more consistent between different modes. It also reduces the number of isel patterns we need.

This is a slight inconsistency with loads which default to 64 bit element opcodes. I'll probably rectify that in a future patch.

Differential Revision: https://reviews.llvm.org/D35978

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309693 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 15:31:24 +00:00
Simon Pilgrim
e105901600 [X86][SSSE3] Fix typos in pabsw/pmulhrsw tests for load folding scheduling.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309692 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 15:31:24 +00:00
Simon Pilgrim
bf59b4ffe0 [X86] Added missing cpu to fix generic scheduling model tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309691 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 15:14:35 +00:00
Nirav Dave
bb2981861c [DAG] Extend visitSCALAR_TO_VECTOR optimization to truncated vector.
Summary:
Allow SCALAR_TO_VECTOR of EXTRACT_VECTOR_ELT to reduce to
EXTRACT_SUBVECTOR of vector shuffle when output is smaller. Marginally
improves vector shuffle computations.

Reviewers: efriedma, RKSimon, spatel

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D35566

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309680 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 13:45:35 +00:00
Strahinja Petrovic
16b949e997 [Mips] Fix for BBIT octeon instruction
This patch enables control flow optimization for
variations of BBIT instruction. In this case
optimization removes unnecessary branch after
BBIT instruction.

Differential Revision: https://reviews.llvm.org/D35359


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309679 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 13:42:45 +00:00
Krzysztof Parzyszek
c6a42e96ed [Hexagon] Convert HVX vector constants of i1 to i8
Certain operations require vector of i1 values. However, for Hexagon
architecture compatibility, they need to be represented as vector of i8.

Patch by Suyog Sarda.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309677 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 13:12:53 +00:00
Simon Pilgrim
1b7afc891a [X86] Regenerate big structure return test and check on x86_64 as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309676 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 13:12:15 +00:00
Tom Stellard
c0ce68eb67 AMDGPU/GlobalISel: Add support for amdgpu_vs calling convention
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D35916

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309675 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 12:38:33 +00:00
Andrew V. Tischenko
052dd78cb3 Support itineraries in TargetSubtargetInfo::getSchedInfoStr - Now if the given instr does not have sched model then we try to calculate the latecy/throughput with help of itineraries.
Differential Revision https://reviews.llvm.org/D35997


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309666 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 09:15:43 +00:00
Eli Friedman
68720204a7 [ScheduleDAG] Don't schedule node with physical register interference
https://reviews.llvm.org/D31536 didn't really solve the problem it was
trying to solve; it got rid of the assertion failure, but we were still
scheduling the DAG incorrectly (mixing together instructions from
different calls), leading to a MachineVerifier failure.

In order to schedule the DAG correctly, we have to make sure we don't
schedule a node which should be blocked by an interference. Fix
ScheduleDAGRRList::PickNodeToScheduleBottomUp so it doesn't pick a node
like that.

The added call to FindAvailableNode() is the key change here; this makes
sure we don't try to schedule a call while we're in the middle of
scheduling a different call. I'm not sure this is the right approach; in
particular, I'm not sure how to prove we don't end up with an infinite
loop of repeatedly backtracking.

This also reverts the code change from D31536. It doesn't do anything
useful: we should never schedule an ADJCALLSTACKDOWN unless we've
already scheduled the corresponding ADJCALLSTACKUP.

Differential Revision: https://reviews.llvm.org/D33818



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309642 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 00:28:40 +00:00
Craig Topper
c28b6e3216 [AVX-512] Add unmasked subvector inserts and extract to the execution domain tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309632 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-31 22:07:29 +00:00
Craig Topper
5f3a835eaf [AVX512] Add a common prefix to avx512-insert-extract.ll so we can reduce the number of check lines on some test cases.
This was pointed out during the review for D313804.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309629 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-31 21:20:06 +00:00
Craig Topper
71201a4208 [AVX-512] Use AVX512 as test check prefix instead of AVX3. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309625 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-31 20:58:06 +00:00
Konstantin Belochapka
de4ee6c474 [X86][MMX] Added custom lowering action for MMX SELECT (PR30418)
Fix for pr30418 - error in backend: Cannot select: t17: x86mmx = select_cc t2, Constant:i64<0>, t7, t8, seteq:ch
Differential Revision: https://reviews.llvm.org/D34661





git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309614 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-31 20:11:49 +00:00
Quentin Colombet
6131fb56ca [TargetPassConfig] Feature generic options to setup start/stop-after/before
This patch refactors the code used in llc such that all the users of the
addPassesToEmitFile API have access to a homogeneous way of handling
start/stop-after/before options right out of the box.

In particular, just invoking addPassesToEmitFile will set the proper
pipeline without additional effort (modulo parsing a .mir file if the
start-before/after options are used.

NFC.

Differential Revision: https://reviews.llvm.org/D30913

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309599 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-31 18:24:07 +00:00
Sanjay Patel
8209d78723 [CGP] use subtract or subtract-of-cmps for result of memcmp expansion
As noted in the code comment, transforming this in the other direction might require 
a separate transform here in CGP given the block-at-a-time DAG constraint.

Besides that theoretical motivation, there are 2 practical motivations for the 
subtract-of-cmps form:

1. The codegen for both x86 and PPC is better for this IR (though PPC could be better still). 
   There is discussion about canonicalizing IR to the select form 
   ( http://lists.llvm.org/pipermail/llvm-dev/2017-July/114885.html ), 
   so we probably need to add DAG transforms for those patterns anyway, but this improves the 
   memcmp output without waiting for that step.

2. If we allow vector-sized chunks for the load and compare, x86 is better prepared to convert
   that to optimal code when using subtract-of-cmps, so another prerequisite patch is avoided
   if we choose to enable that.

Differential Revision: https://reviews.llvm.org/D34904



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309597 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-31 18:08:24 +00:00
Craig Topper
af156f2d96 [AVX-512] Remove patterns that select vmovdqu8/16 for unmasked loads. Prefer vmovdqa64/vmovdqu64 instead.
These were taking priority over the aligned load instructions since there is no vmovda8/16. I don't think there is really a difference between aligned and unaligned on newer cpus so I don't think it matters which instructions we use.

But with this change we reduce the size of the isel table a little and we allow the aligned information to pass through to the evex->vec pass and produce the same output has avx/avx2 in some cases.

I also generally dislike patterns rooted in a bitcast which these were.

Differential Revision: https://reviews.llvm.org/D35977

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309589 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-31 17:35:44 +00:00
Aditya Nandakumar
d98de6bf35 [GISel]: Support Widening G_ICMP's destination operand.
Updated AArch64 to widen destination to s32.
https://reviews.llvm.org/D35737

Reviewed by Tim

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309579 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-31 17:00:16 +00:00
Simon Pilgrim
0bc57f232f [X86] Extending a test cases for LEA factorization.
Submitted on the behalf of Jatin Bhateja

Differential Revision: https://reviews.llvm.org/D36048

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309565 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-31 14:23:28 +00:00
Simon Dardis
8154453cfb [SelectionDAG][mips] Fix PR33883
PR33883 shows that calls to intrinsic functions should not have their vector
arguments or returns subject to ABI changes required by the target.

This resolves PR33883.

Thanks to Alex Crichton for reporting the issue!

Reviewers: zoran.jovanovic, atanasyan

Differential Revision: https://reviews.llvm.org/D35765



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309561 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-31 14:06:58 +00:00