29402 Commits

Author SHA1 Message Date
Kevin P. Neal
e972c4e0da Add AVX support to this test.
Requested by Craig Topper and Andrew Kaylor as part of D55897.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359461 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-29 16:06:04 +00:00
Simon Pilgrim
9043d57972 [X86][SSE] Add scalar horizontal add/sub tests for non-0/1 element extractions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359454 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-29 14:26:27 +00:00
Simon Pilgrim
ee8d96fcfb [X86][SSE] Moved haddps test from phaddsub.ll to haddsub.ll (D61245)
Also merged duplicate PR39921 + PR39936 tests

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359437 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-29 11:30:47 +00:00
Diogo N. Sampaio
87ba52d9d8 [ARM] Add bitcast/extract_subvec. of fp16 vectors
Summary:
This patch adds some basic operations for fp16
vectors, such as bitcast from fp16 to i16,
required to perform extract_subvector (also added
here) and extract_element.

Reviewers: SjoerdMeijer, DavidSpickett, t.p.northover, ostannard

Reviewed By: ostannard

Subscribers: javed.absar, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60618

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359433 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-29 10:28:07 +00:00
Diogo N. Sampaio
bb84648816 [ARM] Add v4f16 and v8f16 types to the CallingConv
Summary:
The Procedure Call Standard for the Arm Architecture
states that float16x4_t and float16x8_t behave just
as uint16x4_t and uint16x8_t for argument passing.
This patch adds the fp16 vectors to the
ARMCallingConv.td file.

Reviewers: miyuki, ostannard

Reviewed By: ostannard

Subscribers: ostannard, javed.absar, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60720


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359431 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-29 10:10:37 +00:00
Simon Pilgrim
b7310f5389 [X86] Add PR39921 HADD pairwise reduction test and AVX2 test coverage
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359409 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-28 21:04:47 +00:00
Simon Pilgrim
f3be67d655 [X86][AVX] Add fast-hops target for add/fadd reduction tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359408 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-28 20:04:08 +00:00
Simon Pilgrim
b4966362f9 [X86] Add PR39936 HADD Tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359407 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-28 20:03:11 +00:00
Simon Pilgrim
2fa97aca16 [X86][AVX] Enabled AVX512F tests and add PR40815 test case
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359401 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-28 15:04:30 +00:00
Simon Pilgrim
a53eda72ec [X86][AVX] Combine non-lane crossing binary shuffles using X86ISD::VPERMV3
Some of the combines might be further improved if we lower more shuffles with X86ISD::VPERMV3 directly, instead of waiting to combine the results.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359400 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-28 14:31:01 +00:00
Sanjay Patel
750719fb74 [SelectionDAG] include FP min/max variants as binary operators
The x86 test diffs don't look great because of extra move ops,
but FP min/max should clearly be included in the list.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359399 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-28 13:19:29 +00:00
Sanjay Patel
88f49fa5ef [DAGCombiner] try repeated fdiv divisor transform before building estimate
This was originally part of D61028, but it's an independent diff.

If we try the repeated divisor reciprocal transform before producing an estimate sequence,
then we have an opportunity to use scalar fdiv. On x86, the trade-off is 1 divss vs. 5
vector FP ops in the default estimate sequence. On recent chips (Skylake, Ryzen), the
full-precision division is only 3 cycle throughput, so that's probably the better perf
default option and avoids problems from x86's inaccurate estimates.

The last 2 tests show that users still have the option to override the defaults by using
the function attributes for reciprocal estimates, but those patterns are potentially made
faster by converting the vector ops (including ymm ops) to scalar math.

Differential Revision: https://reviews.llvm.org/D61149

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359398 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-28 12:23:43 +00:00
Simon Pilgrim
509a83cf30 [X86][SSE] Optimize llvm.experimental.vector.reduce.xor.vXi1 parity reduction (PR38840)
An xor reduction of a bool vector can be optimized to a parity check of the MOVMSK/BITCAST'd integer - if the population count is odd return 1, else return 0.

Differential Revision: https://reviews.llvm.org/D61230

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359396 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-28 10:46:17 +00:00
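
The parity identity described in 509a83cf30 can be checked outside of LLVM. What follows is a minimal Python sketch, not the X86 lowering itself: `movmsk`, `xor_reduce` and `parity_reduce` are hypothetical helper names, and an 8-lane vector is an arbitrary choice for illustration.

```python
from functools import reduce
from itertools import product
from operator import xor


def xor_reduce(lanes):
    """Reference form: fold XOR across every boolean lane."""
    return reduce(xor, lanes, False)


def movmsk(lanes):
    """Hypothetical stand-in for MOVMSK/BITCAST: pack lane i into bit i."""
    mask = 0
    for i, lane in enumerate(lanes):
        if lane:
            mask |= 1 << i
    return mask


def parity_reduce(lanes):
    """XOR reduction computed as the parity of the mask's population count."""
    return bin(movmsk(lanes)).count("1") % 2 == 1


# Exhaustively compare both forms for every 8-lane boolean vector.
all_match = all(
    xor_reduce(v) == parity_reduce(v) for v in product([False, True], repeat=8)
)
```

The exhaustive comparison is cheap at this width and shows why a single popcount-and-test on the packed mask can stand in for a lane-by-lane XOR.
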
Simon Pilgrim
1571d111a2 [X86][AVX] Add AVX512DQ coverage for masked memory ops tests (PR34584)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359395 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-28 10:02:34 +00:00
Craig Topper
2b887560b6 [X86] Remove (V)MOV64toSDrr/m and (V)MOVDI2SSrr/m. Use 128-bit result MOVD/MOVQ and COPY_TO_REGCLASS instead
Summary:
The register forms of these instructions are CodeGenOnly instructions that cover
GR32->FR32 and GR64->FR64 bitcasts. There is a similar set of instructions for
the opposite bitcast. Due to the patterns using bitcasts these instructions get
marked as "bitcast" machine instructions as well. The peephole pass is able to
look through these as well as other copies to try to avoid register bank copies.

Because FR32/FR64/VR128 are all coalescable to each other we can end up in a
situation where a GR32->FR32->VR128->FR64->GR64 sequence can be reduced to
GR32->GR64 which the copyPhysReg code can't handle.

To prevent this, this patch removes one set of the 'bitcast' instructions. So
now we can only go GR32->VR128->FR32 or GR64->VR128->FR64. The instruction that
converts from GR32/GR64->VR128 has no special significance to the peephole pass
and won't be looked through.

I guess the other option would be to add support to copyPhysReg to just promote
the GR32->GR64 to a GR64->GR64 copy. The upper bits were basically undefined
anyway. But removing the CodeGenOnly instruction in favor of one that won't be
optimized seemed safer.

I deleted the peephole test because it couldn't be made to work with the bitcast
instructions removed.

The load versions of the instructions were unnecessary as the pattern that selects
them contains a bitcasted load which should never happen.

Fixes PR41619.

Reviewers: RKSimon, spatel

Reviewed By: RKSimon

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61223

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359392 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-28 06:25:33 +00:00
Simon Pilgrim
723da7d379 Revert rL359389: [X86][SSE] Add support for <64 x i1> bool reduction
Minor generalization of the existing <32 x i1> pre-AVX2 split code.
........
Causing irregular buildbot failures.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359391 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-27 20:44:08 +00:00
Simon Pilgrim
f0586753f1 [X86][AVX] Add additional SSE/AVX expandload and compressstore targets
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359390 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-27 20:20:02 +00:00
Simon Pilgrim
df4ebd860c [X86][SSE] Add support for <64 x i1> bool reduction
Minor generalization of the existing <32 x i1> pre-AVX2 split code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359389 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-27 20:04:44 +00:00
Simon Pilgrim
2060bec1eb [X86][AVX] Cleanup and add additional expandload and compressstore tests
sort order by types and add vXi32/vXi16/vXi8 test coverage

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359388 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-27 19:57:34 +00:00
Simon Pilgrim
8c3e398a3c [X86][AVX512] Improve vector bool reductions
As predicate masks are legal on AVX512 targets, we avoid MOVMSK in these cases, but we can just bitcast the bool vector to the integer equivalent directly - avoiding expansion of the reduction to a shuffle pattern.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359386 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-27 17:32:46 +00:00
Simon Pilgrim
416f89e2cc [X86] Add vector boolean reduction tests (PR38840)
AND/OR/XOR tests for the @llvm.experimental.vector.reduce intrinsics

AND/OR are pretty good (pre-AVX512), XOR (not so common but used for parity reduction) is still pretty bad.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359385 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-27 16:49:54 +00:00
Simon Pilgrim
5378974fba Fix check-prefixes typo
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359382 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-27 15:41:14 +00:00
Simon Pilgrim
93fdfcf73b [X86][SSE] Add initial test case for subvector insert/extract of illegal types
Suggested by @nikic on D59188

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359379 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-27 15:30:06 +00:00
Simon Pilgrim
14033682e6 [X86][AVX] Merge mask select with shuffles across extract_subvector (PR40332)
Fixes PR40332 in the limited case where we're selecting between a target shuffle and a zero vector.

We can extend this in the future to handle more opcodes and non-zero selections.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359378 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-27 13:35:32 +00:00
Craig Topper
bb63d99309 [X86] Use MOVQ for i64 atomic_stores when SSE2 is enabled
Summary: If we have SSE2 we can use a MOVQ to store 64 bits and avoid falling back to a cmpxchg8b loop. If it's a seq_cst store we need to insert an mfence after the store.

Reviewers: spatel, RKSimon, reames, jfb, efriedma

Reviewed By: RKSimon

Subscribers: hiraditya, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60546

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359368 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-27 03:38:15 +00:00
Mark Searles
dfc7fb5622 Revert "AMDGPU: Split block for si_end_cf"
This reverts commit 7a6ef30046.

We discovered some internal test failures, so reverting for now.

Differential Revision: https://reviews.llvm.org/D61213

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359363 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-27 00:51:18 +00:00
Jessica Paquette
2ee5a85229 [GlobalISel][AArch64] Use getConstantVRegValWithLookThrough for extracts
getConstantVRegValWithLookThrough does the same thing as the
getConstantValueForReg function, and has more visibility across GISel. Plus, it
supports looking through G_TRUNC, G_SEXT, and G_ZEXT. So, we get better code
reuse and more functionality for free by using it.

Add some test cases to select-extract-vector-elt.mir to show that we can now
look through those instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359351 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-26 21:53:13 +00:00
Nick Desaulniers
b3cb8ab451 [AsmPrinter] refactor to support %c w/ GlobalAddress'
Summary:
Targets like ARM, MSP430, PPC, and SystemZ have complex behavior when
printing the address of a MachineOperand::MO_GlobalAddress. Move that
handling into a new overridden method in each base class. A virtual
method was added to the base class for handling the generic case.

Refactors a few subclasses to support the target independent %a, %c, and
%n.

The patch also contains small cleanups for AVRAsmPrinter and
SystemZAsmPrinter.

It seems that NVPTXTargetLowering is possibly missing some logic to
transform GlobalAddressSDNodes for
TargetLowering::LowerAsmOperandForConstraint to handle with "i" extended
inline assembly asm constraints.

Fixes:
- https://bugs.llvm.org/show_bug.cgi?id=41402
- https://github.com/ClangBuiltLinux/linux/issues/449

Reviewers: echristo, void

Reviewed By: void

Subscribers: void, craig.topper, jholewinski, dschuff, jyknight, dylanmckay, sdardis, nemanjai, javed.absar, sbc100, jgravelle-google, eraman, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, jrtc27, atanasyan, jsji, llvm-commits, kees, tpimh, nathanchance, peter.smith, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60887

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359337 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-26 18:45:04 +00:00
Simon Pilgrim
4db70c21dc [X86][AVX] Fold extract_subvector(broadcast(x)) -> broadcast(x) iff x has one use
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359332 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-26 18:02:14 +00:00
Jessica Paquette
05a31451d2 [AArch64][GlobalISel] Select G_BSWAP for vectors of s32 and s64
There are instructions for these, so mark them as legal. Select the correct
instruction in AArch64InstructionSelector.cpp.

Update select-bswap.mir and arm64-rev.ll to reflect the changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359331 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-26 18:00:01 +00:00
Stanislav Mekhanoshin
09f8a0f6a0 [AMDGPU] gfx1010 VOP3 and VOP3P implementation
Differential Revision: https://reviews.llvm.org/D61202

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359328 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-26 17:56:03 +00:00
Stanislav Mekhanoshin
834873d34d [AMDGPU] gfx1010 VOP2 changes
Differential Revision: https://reviews.llvm.org/D61156

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359316 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-26 16:37:51 +00:00
Sanjay Patel
d3cefe91ff [x86] add tests for fmin/fmax; NFC
'maximum' and 'minimum' still crash, so they are commented out.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359306 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-26 13:36:37 +00:00
Simon Pilgrim
cc6487ed56 [X86][SSE] Disable shouldFoldConstantShiftPairToMask for btver1/btver2 targets (PR40758)
As detailed on PR40758, Bobcat/Jaguar can perform vector immediate shifts on the same pipes as vector ANDs with the same latency - so it doesn't make sense to replace a shl+lshr with a shift+and pair as it requires an additional mask (with the extra constant pool, loading and register pressure costs).

Differential Revision: https://reviews.llvm.org/D61068

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359293 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-26 10:49:13 +00:00
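
For readers unfamiliar with the fold that cc6487ed56 disables on btver1/btver2, the equivalence between a shl+lshr pair and a shift+and pair can be sketched in Python. This is an illustration only, assuming a 32-bit lane; `WIDTH`, `shl_lshr` and `shift_and` are hypothetical names, not LLVM functions.

```python
WIDTH = 32                      # assumed lane width for the illustration
LANE_MASK = (1 << WIDTH) - 1


def shl_lshr(x, c1, c2):
    """Original pair: shift left by c1, then logical shift right by c2."""
    return ((x << c1) & LANE_MASK) >> c2


def shift_and(x, c1, c2):
    """Folded pair: one shift by the net amount, then AND with a constant."""
    mask = ((LANE_MASK << c1) & LANE_MASK) >> c2   # the extra constant operand
    if c1 >= c2:
        shifted = (x << (c1 - c2)) & LANE_MASK
    else:
        shifted = x >> (c2 - c1)
    return shifted & mask


# Compare both forms on a few sample values and shift amounts.
samples = [0, 1, 0xDEADBEEF, 0x80000001, LANE_MASK]
forms_agree = all(
    shl_lshr(x, c1, c2) == shift_and(x, c1, c2)
    for x in samples
    for c1 in range(WIDTH)
    for c2 in range(WIDTH)
)
```

The `mask` value is the additional constant the commit message refers to: the folded form produces the same result but needs that constant materialized, which is the cost being avoided on these targets.
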
Simon Pilgrim
8a0d120e52 [X86][AVX] Combine shuffles extracted from a common vector
A small step towards combining shuffles across vector sizes - this recognizes when a shuffle's operands are all extracted from the same larger source and tries to combine to a unary shuffle of that source instead. Fixes one of the test cases from PR34380.

Differential Revision: https://reviews.llvm.org/D60512

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359292 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-26 09:56:14 +00:00
Hans Wennborg
d51e703994 Fix alignment in AArch64InstructionSelector::emitConstantPoolEntry()
The code was using the alignment of a pointer to the value, not the
alignment of the constant itself.

Maybe we got away with it so far because the pointer alignment is
fairly high, but we did end up under-aligning <16 x i8> vectors,
which was caught in the Chromium build after lld stopped over-aligning
the .rodata.cst16 section in r356428. (See crbug.com/953815)

Differential revision: https://reviews.llvm.org/D61124

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359287 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-26 08:31:00 +00:00
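
The under-alignment described in d51e703994 can be illustrated with plain offset arithmetic. The sketch below is not the AArch64 selector code: `align_to` is a hypothetical helper, and the 8-byte pointer alignment, 16-byte vector alignment and starting offset of 24 are assumed values chosen for the example.

```python
def align_to(offset, alignment):
    """Round offset up to the next multiple of a power-of-two alignment."""
    return (offset + alignment - 1) & ~(alignment - 1)


POINTER_ALIGN = 8    # assumed alignment of a pointer to the value
VECTOR_ALIGN = 16    # assumed alignment of a <16 x i8> constant

next_free_offset = 24  # assumed position in the constant section

# Using the pointer's alignment leaves the vector on an 8-byte boundary.
wrong_offset = align_to(next_free_offset, POINTER_ALIGN)
# Using the constant's own alignment moves it to a 16-byte boundary.
right_offset = align_to(next_free_offset, VECTOR_ALIGN)

wrong_is_misaligned = wrong_offset % VECTOR_ALIGN != 0
```

With these assumed numbers the first placement is only correct if something else, such as a linker that over-aligns the section, happens to compensate.
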
Artem Belevich
9e3c94a04c PTX 6.3 extends wmma instruction to support s8/u8/s4/u4/b1 -> s32.
All of the new instructions are still handled mostly by tablegen. I've slightly
refactored the code to drive intrinsic/instruction generation from a master
list of supported variants, so all irregularities have to be implemented in one place only.

The test generation script wmma.py has been refactored in a similar way.

Differential Revision: https://reviews.llvm.org/D60015

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359247 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-25 22:27:57 +00:00
Artem Belevich
5446c675c4 [NVPTX] generate correct MMA instruction mnemonics with PTX63+.
PTX 6.3 requires using ".aligned" in the MMA instruction names.
To generate the correct name, we now pass the current
PTX version to each instruction as an extra constant operand,
and InstPrinter adjusts its output accordingly.

Differential Revision: https://reviews.llvm.org/D59393

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359246 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-25 22:27:46 +00:00
Sanjay Patel
8f63d03e3a [x86] add tests for vector fdiv reciprocal estimate; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359238 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-25 20:35:47 +00:00
Jessica Paquette
9629553b79 [GlobalISel][AArch64] Make G_EXTRACT_VECTOR_ELT legal for v8s16s
This case was missing before, so we couldn't legalize it.

Add it to AArch64LegalizerInfo.cpp and update select-extract-vector-elt.mir.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359231 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-25 20:00:57 +00:00
Jessica Paquette
194fea7ca5 [GlobalISel][AArch64] Add generic legalization rule for extends
This adds a legalization rule for G_ZEXT, G_ANYEXT, and G_SEXT which allows
extends whenever the types will fit in registers (or the source is an s1).

Update tests. Add GISel checks throughout all of arm64-vabs.ll,
where we now select a good portion of the code. Add GISel checks to
arm64-subvector-extend.ll, which has a good number of vector extends in it.

Differential Revision: https://reviews.llvm.org/D60889

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359222 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-25 18:42:00 +00:00
Craig Topper
2d0f20640c [SelectionDAG][X86] Use stack load/store in PromoteIntRes_BITCAST when the input needs to be split and the output type is a vector.
We had special case handling here, but it uses a scalar any_extend for the
promotion then bitcasts to the final type. This won't split up the input data
into multiple promoted elements like we need.

This patch falls back to doing the conversion through memory.

Fixes PR41594 which I believe was reflected in the bitcast-vector-bool.ll
changes. The changes to vector-half-conversions.ll are fixing a previously
unknown miscompile from this issue.

Differential Revision: https://reviews.llvm.org/D61114

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359219 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-25 18:19:59 +00:00
Jessica Paquette
e6dbec3d99 [GlobalISel][AArch64] Legalize G_FNEARBYINT
Add legalizer support for G_FNEARBYINT. It's the same as G_FCEIL etc.

Since the importer allows us to automatically select this after legalization,
also add tests for selection etc. Also update arm64-vfloatintrinsics.ll.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359204 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-25 16:44:40 +00:00
Jessica Paquette
f702279be5 [GlobalISel] Add IRTranslator support for G_FNEARBYINT
Translate llvm.nearbyint into G_FNEARBYINT as a simple intrinsic. Update
arm64-irtranslator.ll.

Differential Revision: https://reviews.llvm.org/D60922

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359203 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-25 16:39:28 +00:00
Jessica Paquette
e0dc647b6b [GlobalISel] Add a G_FNEARBYINT opcode
For eventually selecting llvm.nearbyint. Equivalent to the SelectionDAG
nearbyint node.

Update legalizer-info-validation.mir.

Differential Revision: https://reviews.llvm.org/D60921

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359201 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-25 16:36:03 +00:00
Simon Pilgrim
d6778e97e0 [X86][SSE] combineBitcastvxi1 - add support for bitcasting to non-scalar integers
Truncate the movmsk scalar integer result to the equivalent scalar integer width as before but then bitcast to the requested type.

We still have the issue identified in PR41594 but D61114 should handle this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359176 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-25 09:34:36 +00:00
Simon Atanasyan
36cbe77a6e [MIPS] Use custom bitcast lowering to avoid excessive instructions
On Mips32r2 bitcast can be expanded to two sw instructions and an ldc1
when using bitcast i64 to double or an sdc1 and two lw instructions when
using bitcast double to i64. By introducing custom lowering that uses
mtc1/mthc1 we can avoid excessive instructions.

Patch by Mirko Brkusanin.

Differential Revision: https://reviews.llvm.org/D61069

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359171 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-25 07:47:28 +00:00
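
The semantics of the i64-to-double bitcast discussed in 36cbe77a6e can be modelled in Python with the standard `struct` module. This is only a sketch of what the two lowerings compute, not of the MIPS instructions themselves; little-endian byte order is assumed and the function names are hypothetical.

```python
import struct


def bitcast_via_memory(bits):
    """Model of the expanded form: store two 32-bit words, load one double."""
    lo = bits & 0xFFFFFFFF
    hi = (bits >> 32) & 0xFFFFFFFF
    buf = struct.pack("<II", lo, hi)       # two word stores
    return struct.unpack("<d", buf)[0]     # one doubleword load


def bitcast_direct(bits):
    """Model of the direct form: reinterpret the 64 bits as a double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


ONE = 0x3FF0000000000000  # IEEE-754 bit pattern of 1.0
```

Both helpers return the same value for any 64-bit pattern; the commit is about reaching that result without the round trip through memory.
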
Alina Sbirlea
6ac87016ff Enable LoopVectorization by default.
Summary:
When refactoring vectorization flags, vectorization was disabled by default in the new pass manager.
This patch re-enables it for both managers, and changes the assumptions opt makes, based on the new defaults.
Comments in opt.cpp should clarify the intended use of all flags to enable/disable vectorization.

Reviewers: chandlerc, jgorbe

Subscribers: jlebar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61091

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359167 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-25 04:49:48 +00:00
Amy Huang
ec039c845f Recommitting r358783 and r358786 "[MS] Emit S_HEAPALLOCSITE debug info" with fixes for buildbot error (undefined assembler label).
Summary:
This emits labels around heapallocsite calls and S_HEAPALLOCSITE debug
info in codeview. Currently only changes FastISel, so emitting labels still
needs to be implemented in SelectionDAG.

Reviewers: rnk

Subscribers: aprantl, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D61083

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359149 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-24 23:02:48 +00:00
Sanjay Patel
42ee3f5798 [DAGCombiner] scale repeated FP divisor by splat factor
If we have a vector FP division with a splatted divisor, use the existing transform
that converts 'x/y' into 'x * (1.0/y)' to allow more conversions. This can then
potentially be converted into a scalar FP division by existing combines (rL358984)
as seen in the tests here.

That can be a potentially big perf difference if scalar fdiv has better timing
(including avoiding possible frequency throttling for vector ops).

Differential Revision: https://reviews.llvm.org/D61028

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359147 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-24 22:28:58 +00:00
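
The 'x/y' to 'x * (1.0/y)' rewrite mentioned in 42ee3f5798 can be sketched in Python as follows. This is an illustration of the arithmetic only, not of the DAGCombiner code; the function names and sample values are hypothetical.

```python
import math


def divide_each(xs, y):
    """One division per lane by the same (splatted) divisor."""
    return [x / y for x in xs]


def multiply_by_reciprocal(xs, y):
    """A single scalar division, then one multiply per lane."""
    reciprocal = 1.0 / y
    return [x * reciprocal for x in xs]


lanes = [1.0, 2.0, 3.0, 10.0]
divisor = 3.0

divided = divide_each(lanes, divisor)
scaled = multiply_by_reciprocal(lanes, divisor)

# The two forms agree closely but are not guaranteed to be bit-identical.
close_enough = all(math.isclose(a, b, rel_tol=1e-15) for a, b in zip(divided, scaled))
```

Because the reciprocal is rounded before the multiply, the results may differ in the last bit, which is generally why rewrites of this kind depend on relaxed floating-point settings.
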