Commit Graph

47257 Commits

Author SHA1 Message Date
Craig Topper
ccba49dfc2 [InstCombine] Teach select01 helper of foldSelectIntoOp to handle vector splats
We were handling some vectors in foldSelectIntoOp, but not if the operand of the bin op was any kind of vector constant. This patch fixes it to treat vector splats the same as scalars.

Differential Revision: https://reviews.llvm.org/D37232

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311940 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-28 22:00:27 +00:00
Marek Sokolowski
f2e5589b0f [llvm-rc] Add ICON and HTML parsing ability (parser, pt 2/8).
This extends the current llvm-rc parser by ICON and HTML resources.
Moreover, some tests have been slightly rewritten.

Thanks to Nico Weber for his original work in this area.

Differential Revision: https://reviews.llvm.org/D36891

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311939 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-28 21:59:54 +00:00
Sanjay Patel
cd4a7cd9dc [InstCombine] add tests to show failure of SimplifyDemandedVectorElts + shuffle combining; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311934 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-28 21:14:26 +00:00
Geoff Berry
0f7a757315 [AArch64][Falkor] Avoid generating STRQro* instructions
Summary:
STRQro* instructions are slower than the alternative ADD/STRQui expanded
instructions on Falkor, so avoid generating them unless we're optimizing
for code size.

Reviewers: t.p.northover, mcrosier

Subscribers: aemerson, rengolin, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D37020

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311931 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-28 20:48:43 +00:00
Davide Italiano
5e8dffb156 [LoopUnroll] Properly update loop structure in case of successful peeling.
When peeling kicks in, it updates the loop preheader.
Later, a successful full unroll of the loop needs to update a PHI
whose i-th argument comes from the loop preheader, so it'd better look
at the correct block. Fixes PR33437.

Differential Revision:  https://reviews.llvm.org/D37153

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311922 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-28 20:29:33 +00:00
Joerg Sonnenberger
eb8f624e3d Fix ARMv4 support
ARMv4 doesn't support the "BX" instruction, which has been introduced
with ARMv4t. Adjust the call lowering and tail call implementation
accordingly.

Further changes are necessary to ensure that presence of the v4t feature
is correctly set. Most importantly, the "generic" CPU for thumb-*
triples should include ARMv4t, since thumb mode without thumb support
would naturally be pointless.

Add a couple of asserts to ensure thumb instructions are not emitted
without CPU support.

Differential Revision: https://reviews.llvm.org/D37030


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311921 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-28 20:20:47 +00:00
Matthias Braun
7dc0bf2675 Address r311914 review comments
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311917 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-28 20:11:27 +00:00
Matthias Braun
863f34946c TableGen: Fix subreg composition/concatenation
This fixes 2 problems in subregister hierarchies with multiple levels
and tuples:

1) For bigger tuples computing secondary subregs would miss 2nd order
effects.  In the test case a register like `S10_S11_S12_S13_S14` with D5
= S10_S11, D6 = S12_S13 we would correctly compute sub0 = D5, sub1 = D6
but would miss the fact that we could now form ssub0_ssub1_ssub2_ssub3
(aka sub0_sub1) = D5_D6. This is fixed by changing
computeSecondarySubRegs() to compute a fixpoint.

2) Fixing 1) exposed a problem where TableGen would create multiple
names for effectively the same subregister index. In the test case
the subregister index sub0 is composed from ssub0 and ssub1, and sub1 is
composed from ssub2 and ssub3. TableGen should not create both sub0_sub1
and ssub0_ssub1_ssub2_ssub3 as inferred subregister indexes. This changes
the code to build a transitive closure of the subregister components
before forming new concatenated subregister indexes.
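
The idea in 2) can be sketched outside TableGen. The following is a minimal illustration, not the code from this patch: `ComponentMap`, `leafComponents`, and `sameConcatenation` are hypothetical names, and the model assumes each composite index simply lists the indexes it concatenates. Expanding every name down to its leaf components shows that sub0_sub1 and ssub0_ssub1_ssub2_ssub3 describe the same index.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical model: a composite index maps to the indexes it concatenates.
// Any index that is not a key in the map is treated as a leaf.
using ComponentMap = std::map<std::string, std::vector<std::string>>;

// Expand Name into the ordered list of leaf indexes it ultimately covers.
// Assumes the component map is acyclic.
std::vector<std::string> leafComponents(const ComponentMap &Parts,
                                        const std::string &Name) {
  auto It = Parts.find(Name);
  if (It == Parts.end())
    return {Name}; // Leaf index: nothing to expand.
  std::vector<std::string> Leaves;
  for (const std::string &Part : It->second) {
    std::vector<std::string> Sub = leafComponents(Parts, Part);
    Leaves.insert(Leaves.end(), Sub.begin(), Sub.end());
  }
  return Leaves;
}

// Two concatenations denote the same index when their leaf lists are equal.
bool sameConcatenation(const ComponentMap &Parts,
                       const std::vector<std::string> &A,
                       const std::vector<std::string> &B) {
  auto Expand = [&Parts](const std::vector<std::string> &Names) {
    std::vector<std::string> Out;
    for (const std::string &N : Names) {
      std::vector<std::string> Sub = leafComponents(Parts, N);
      Out.insert(Out.end(), Sub.begin(), Sub.end());
    }
    return Out;
  };
  return Expand(A) == Expand(B);
}
```

Comparing expanded leaf lists rather than names is what lets a single canonical index be chosen before any new concatenated name is created.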

This fix was developed for an out of tree target. For the in-tree
targets the only change is in the register information computed for ARM.
There is a slight chance this fixed/improved some register coalescing
around the QQQQ/QQ register classes there but I couldn't see/provoke any
code generation differences.

Differential Revision: https://reviews.llvm.org/D36913

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311914 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-28 19:48:42 +00:00
Geoff Berry
363f0068dd [ARM] Fix bug in ARMLoadStoreOptimizer when kill flags are missing.
Summary:
ARMLoadStoreOpt::FixInvalidRegPairOp() was only checking if one of the
load destination registers to be split overlapped with the base register
if the base register was marked as killed.  Since kill flags may not
always be present, this can lead to incorrect code.

This bug was exposed by my MachineCopyPropagation change D30751 breaking
the sanitizer-x86_64-linux-android buildbot.

Also clean up some dead code and add an assert that a register offset is
never encountered by this code, since it does not handle them correctly.

Reviewers: MatzeB, qcolombet, t.p.northover

Subscribers: aemerson, javed.absar, kristof.beyls, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D37164

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311907 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-28 19:03:45 +00:00
Taewook Oh
2dc1928441 Create PHI node for the return value only when the return value has uses.
Summary:
Currently, a phi node is created in the normal destination to unify the return values from promoted calls and the original indirect call. This patch creates this phi node only when the return value has uses.

This patch is necessary to generate valid code, as the compiler crashes on the attached test case without it. Without this patch, an illegal phi node that has no incoming value from `entry`/`catch` is created in the `cleanup` block.

I think the existing implementation is fine as long as there is at least one use of the original indirect call. `insertCallRetPHI` creates a new phi node in the normal destination block only when the original indirect call dominates its use and the normal destination block. Otherwise, `fixupPHINodeForNormalDest` will handle the unification of return values naturally without creating a new phi node. However, if there's no use, `insertCallRetPHI` still creates a new phi node even when the original indirect call does not dominate the normal destination block, because `getCallRetPHINode` returns false.

Reviewers: xur, davidxl, danielcdh

Reviewed By: xur

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37176

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311906 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-28 18:57:00 +00:00
Zachary Turner
fc50e1c612 [CodeView] Don't output S_UDT symbols for forward decls.
S_UDT symbols are the debugger's "index" for all the structs,
typedefs, classes, and enums in a program.  If any of those
structs/classes don't have a complete declaration, or if there
is a typedef to something that doesn't have a complete definition,
then emitting the S_UDT is unhelpful because it doesn't give
the debugger enough information to do anything useful.  On the
other hand, it results in a huge size blow-up in the resulting
PDB, which is exacerbated by an order of magnitude when linking
with /DEBUG:FASTLINK.

With this patch, we drop S_UDT records for types that refer either
directly or indirectly (e.g. through a typedef, pointer, etc) to
a class/struct/union/enum without a complete definition.  This
brings us about 50% of the way towards parity with /DEBUG:FASTLINK
PDBs generated from cl-compiled object files.

Differential Revision: https://reviews.llvm.org/D37162

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311904 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-28 18:49:04 +00:00
Stefan Pintilie
8ecaf1929f [Power9] Add new instructions for floating point status and control registers.
Added the following P9 instructions: mffsce, mffscdrn, mffscdrni, mffscrn,
  mffscrni, mffsl

Differential Revision: https://reviews.llvm.org/D37167

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311903 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-28 18:46:01 +00:00
Craig Topper
a3ced95cbe [InstCombine] Call hasNoSignedWrap instead of hasNoUnsignedWrap to get the NSW flag when handling Add in SimplifyDemandedUseBits.
This is a typo from r311789.

This should fix PR34349.
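
To see why mixing up the two flags matters, here is a small sketch of the two properties on 8-bit values. It is illustrative only and does not use LLVM's API; `addHasNoSignedWrap` and `addHasNoUnsignedWrap` are hypothetical free functions that model what the NSW and NUW flags assert about an add.

```cpp
#include <cstdint>

// Models NSW: the sum fits in 8 bits when both operands are read as signed.
bool addHasNoSignedWrap(std::int8_t a, std::int8_t b) {
  int wide = int(a) + int(b); // Computed in a wider type, so no overflow here.
  return wide >= INT8_MIN && wide <= INT8_MAX;
}

// Models NUW: the sum fits in 8 bits when both operands are read as unsigned.
bool addHasNoUnsignedWrap(std::uint8_t a, std::uint8_t b) {
  unsigned wide = unsigned(a) + unsigned(b);
  return wide <= UINT8_MAX;
}
```

The two properties are independent: 127 + 1 wraps as a signed add but not as an unsigned one, while 255 + 1 (that is, -1 + 1 when read as signed) wraps as an unsigned add but not as a signed one. Reading one flag in place of the other can therefore justify a transform that is not valid.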

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311902 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-28 18:44:28 +00:00
Krzysztof Parzyszek
a0dd08a806 [Hexagon] Check for potential bank conflicts in post-RA scheduling
Insert artificial edges between loads that could cause a cache bank
conflict.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311901 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-28 18:36:21 +00:00
Stanislav Mekhanoshin
9324a77aa4 [AMDGPU] Fix regression in AMDGPULibCalls allowing native for doubles
Under -cl-fast-relaxed-math we could use native_sqrt, but f64 was
allowed to produce HSAIL's nsqrt instruction. HSAIL is not present here,
so we end up with a non-existent native_sqrt(double) as a result.

Add check for f64 to not return native functions and also remove
handling of f64 case for fold_sqrt.

Differential Revision: https://reviews.llvm.org/D37223

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311900 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-28 18:00:08 +00:00
Stanislav Mekhanoshin
f4dd1bdd9a [AMDGPU] computeKnownBitsForTargetNode for 24 bit mul
Differential Revision: https://reviews.llvm.org/D37168

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311896 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-28 16:35:37 +00:00
Craig Topper
337c2dfa0b [DAGCombiner] Teach visitEXTRACT_SUBVECTOR to turn extracts of BUILD_VECTOR into smaller BUILD_VECTORs
Only do this before operations are legalized or if BUILD_VECTOR is Legal for the target.
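
As a rough model of the combine, outside SelectionDAG: a BUILD_VECTOR is treated as a plain list of scalar operands, and extracting a subvector from it amounts to building a new, shorter list from the matching operands. The function name and the use of `int` operands are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Model of extract_subvector(build_vector(ops...), index) with numElts result
// lanes: the result is a smaller build_vector made of the selected operands.
// Precondition: index + numElts <= buildVectorOps.size().
std::vector<int> extractSubvector(const std::vector<int> &buildVectorOps,
                                  std::size_t index, std::size_t numElts) {
  auto first = buildVectorOps.begin() + static_cast<std::ptrdiff_t>(index);
  auto last = first + static_cast<std::ptrdiff_t>(numElts);
  return std::vector<int>(first, last);
}
```

For example, taking four lanes at index 4 from an eight-operand list yields a list holding operands 4 through 7.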

Differential Revision: https://reviews.llvm.org/D37186

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311892 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-28 15:28:33 +00:00
Andrew V. Tischenko
384007e569 The current version of the LLVM X86 disassembler incorrectly interprets some possible sets of x86 prefixes. This patch is the first step toward closing PR7709 and PR17697. Follow-up patch(es) will close the related PRs.
Differential Revision: https://reviews.llvm.org/D36788

M    lib/Target/X86/Disassembler/X86DisassemblerDecoder.cpp
M    lib/Target/X86/Disassembler/X86DisassemblerDecoder.h
A    test/MC/Disassembler/X86/prefixes-i386.s
A    test/MC/Disassembler/X86/prefixes-x86_64.s
M    test/MC/Disassembler/X86/prefixes.txt


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311882 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-28 10:43:14 +00:00
Gadi Haber
b123fd02f9 [X86][Haswell] Updating HSW instruction scheduling information
This patch completely replaces the instruction scheduling information for the Haswell architecture target by modifying the file X86SchedHaswell.td located under the X86 Target.
We used the scheduling information retrieved from the Haswell architects in order to replace and modify the existing scheduling.
The patch continues the scheduling replacement effort started with the SNB target in r307529 and r310792.
The information includes the latency, number of micro-ops, and ports used by each HSW instruction.

Please expect some performance fluctuations due to code alignment effects.

Reviewers: RKSimon, zvi, aymanmus, craig.topper, m_zuckerman, igorb, dim, chandlerc, aaboud

Differential Revision: https://reviews.llvm.org/D36663

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311879 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-28 10:04:16 +00:00
Dehao Chen
3607b8f0f2 revert r310985 which breaks for the following case:
struct string {
  ~string();
};
void f2();
void f1(int) { f2(); }
void run(int c) {
  string body;
  while (true) {
    if (c)
      f1(c);
    else
      f1(c);
  }
}

Will recommit once the issue is fixed.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311864 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-27 22:22:39 +00:00
Petar Jovanovic
8679b1f292 [mips] Generate NMADD and NMSUB instructions when fneg node is present
This patch enables generation of NMADD and NMSUB instructions when fneg node
is present. These instructions are currently only generated if fsub node is
present.

Patch by Stanislav Ocovaj.

Differential Revision: https://reviews.llvm.org/D34507


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311862 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-27 21:07:24 +00:00
Craig Topper
af63a49b5d [AVX512] Add more patterns for using masked moves for subvector extracts of the lowest subvector. This time with bitcasts between the vselect and the extract.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311856 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-27 19:03:36 +00:00
Sanjay Patel
3d4355f5e5 [DAGCombiner] allow undef shuffle operands when eliminating bitcasts (PR34111)
As noted in the FIXME, this could be improved more, but this is the smallest fix
that helps:
https://bugs.llvm.org/show_bug.cgi?id=34111


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311853 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-27 17:29:30 +00:00
Sanjay Patel
0ba3b8e232 [x86] add haddps test for PR34111; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311852 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-27 17:15:49 +00:00
Ayal Zaks
a183d6cf2a [LV] Fix PR34248 - recommit D32871 after revert r311304
Original commit r311077 of D32871 was reverted in r311304 due to failures
reported in PR34248.

This recommit fixes PR34248 by restricting the packing of predicated scalars
into vectors only when vectorizing, avoiding doing so when unrolling w/o
vectorizing. Added a test derived from the reproducer of PR34248.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311849 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-27 12:55:46 +00:00
Jatin Bhateja
6e17a3e9db [X86] Adding more tests for horizontal [F]HADD/[F]SUB for AVX512 vector types
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311847 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-27 12:43:25 +00:00
Craig Topper
395cdbc9b5 [X86] Add a target-specific DAG combine to combine extract_subvector from all zero/one build_vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311841 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-27 05:39:57 +00:00
Craig Topper
2368d9f73d [AVX512] Add patterns to match masked extract_subvector with bitcasts between the vselect and the extract_subvector. Remove the late DAG combine.
We used to do a late DAG combine to move the bitcasts out of the way, but I'm starting to think that it's better to canonicalize extract_subvector's type to match the type of its input. I've seen some cases where we've formed two different extract_subvector from the same node where one had a bitcast and the other didn't.

Add some more test cases to ensure we've also got most of the zero masking covered too.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311837 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-26 22:24:57 +00:00
Jatin Bhateja
4b94f747ff [X86] Adding a test for horizontal [f]add/[f]sub for avx512 vector type 16x32.
Differential Revision: https://reviews.llvm.org/D37183

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311834 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-26 19:02:49 +00:00
Jatin Bhateja
9d2ff1a85a [DAGCombiner] Extending pattern detection for vector shuffle.
Summary:
If all the operands of a BUILD_VECTOR extract elements from same vector then split the
vector efficiently based on the maximum vector access index.

This will also fix PR 33784

Reviewers: zvi, delena, RKSimon, thakis

Reviewed By: RKSimon

Subscribers: chandlerc, eladcohen, llvm-commits

Differential Revision: https://reviews.llvm.org/D35788

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311833 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-26 19:02:36 +00:00
Jatin Bhateja
afa978fffd Revert rL311247 : To rectify commit message.
Summary: This reverts commit rL311247.

Differential Revision: https://reviews.llvm.org/D36927

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311832 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-26 19:02:17 +00:00
Daniel Berlin
7824530ee0 NewGVN: Fix PR33204 - We need to add memory users when we bypass memorydefs for loads, not just when we do it for stores.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311829 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-26 07:37:11 +00:00
Craig Topper
a490e81d57 [X86] Qualify the RMW INC/DEC patterns with NotSlowIncDec.
We were suppressing most uses of INC/DEC, but this one seems to have been missed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311828 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-26 06:24:25 +00:00
Petr Hosek
67092e6ede Revert "[llvm] Add symbol table support to llvm-objcopy"
This reverts commit r311826 because it's failing on llvm-i686-linux-RA.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311827 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-26 03:22:25 +00:00
Petr Hosek
087e2cd838 [llvm] Add symbol table support to llvm-objcopy
This change adds support for SHT_SYMTAB sections.

Patch by Jake Ehrlich

Differential Revision: https://reviews.llvm.org/D34167

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311826 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-26 03:18:41 +00:00
Craig Topper
69d4710ed0 [AVX512] Add patterns to use masked moves to implement masked extract_subvector of the lowest subvector.
This only supports 32 and 64 bit element sizes for now. But we could probably do 16 and 8-bit elements with BWI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311821 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-25 23:34:59 +00:00
Craig Topper
e0707b12d1 [AVX512] Add additional test cases for masked extract subvector.
This includes tests for extracting 128-bits from a 256-bit vector and zero masking.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311820 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-25 23:34:57 +00:00
Craig Topper
cb02fcfc43 [X86] Add patterns to show more failures to use TBM instructions when we're trying to check flags.
We can probably add patterns to fix some of them. But the ones that use 'and' as their root node emit an X86ISD::CMP node in front of the 'and' and then pattern match that to a 'test' instruction. We can't use a tablegen pattern to fix that because we can't remap the cmp result to the flag output of a TBM instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311819 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-25 23:34:55 +00:00
Chandler Carruth
ac54edee6f [x86] Teach the backend to fold more read-modify-write memory operands
to instructions.

These can't be reasonably matched in tablegen due to the handling of
flags, so we have to do this in C++ code. We only did it for `inc` and
`dec` historically, this starts fleshing that out to more interesting
instructions. Notably, this handles transferring operands to `add` and
`sub`.

Currently this forces them into a register. The next patch will add
support for keeping immediate operands as immediates. Then I'll extend
this beyond just `add` and `sub`.

I'm not super thrilled by the repeated switches in the code but
everything else I tried was really ugly or problematic.

Many thanks to Craig Topper for the suggestions about where to even
begin here and how to make this stuff work.

Differential Revision: https://reviews.llvm.org/D37130

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311806 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-25 22:50:52 +00:00
Davide Italiano
5cf0f4679c [Verifier] Diagnose invalid DIType references instead of crashing.
Fixes PR34325.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311805 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-25 22:08:15 +00:00
Matt Morehouse
cd698b8c34 Revert "[SanitizeCoverage] Enable stack-depth coverage for -fsanitize=fuzzer"
This reverts r311801 due to a bot failure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311803 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-25 22:01:21 +00:00
Matt Morehouse
8d5696051c [SanitizeCoverage] Enable stack-depth coverage for -fsanitize=fuzzer
Summary:
- Don't sanitize __sancov_lowest_stack.
- Don't instrument leaf functions.
- Add CoverageStackDepth to Fuzzer and FuzzerNoLink.

Reviewers: vitalybuka, kcc

Reviewed By: kcc

Subscribers: cfe-commits, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D37156

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311801 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-25 21:18:29 +00:00
Kostya Serebryany
df54667cf4 [sanitizer-coverage] extend fsanitize-coverage=pc-table with flags for every PC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311794 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-25 19:29:47 +00:00
Sanjay Patel
0588b413e4 [x86] regenerate checks; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311793 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-25 19:25:03 +00:00
Craig Topper
3ec9576f37 [InstCombine] Add tests to show missed opportunities to combine bit tests hidden by a sign compare and a truncate. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311784 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-25 17:14:35 +00:00
Florian Hahn
f360477df5 [LoopInterchange] Skip zext instructions when looking for induction var.
Summary:
SimplifyIndVar may introduce zext instructions to widen arguments of the
loop exit check. They should not prevent us from splitting the loop at
the induction variable, but maybe the check should be more conservative,
e.g. making sure it only extends arguments used by a comparison?

Reviewers: karthikthecool, mcrosier, mzolotukhin

Reviewed By: mcrosier

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D34879

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311783 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-25 16:52:29 +00:00
David Green
6e7a4adfcf [gold] Fix up a new test to allow it to pass on non x86 builds.
Fix a test that is failing on a downstream ARM/AArch64
bootstrap. We just need to add an elf_x86_64 parameter to
gold.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311780 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-25 16:14:56 +00:00
Amjad Aboud
513af851dd [InstCombine] Consider more cases where SimplifyDemandedUseBits does not convert AShr to LShr.
There are cases where AShr has a better chance of being optimized than LShr, especially when the demanded bits are not known to be zero and are known to match the sign bit.

Differential Revision: https://reviews.llvm.org/D36936




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311773 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-25 11:07:54 +00:00
Aditya Nandakumar
43aeabcde1 [GISel]: Implement widenScalar for Legalizing G_PHI
https://reviews.llvm.org/D37018

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311763 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-25 04:57:27 +00:00
Chandler Carruth
20943fdc57 [x86] NFC - normalize test case formatting of IR and generate CHECK
lines with the script rather than using manually written checks.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311753 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-25 02:32:51 +00:00
Gor Nishanov
8970bfadd2 [coroutines] Add support for symmetric control transfer (musttail on coro.resumes followed by a suspend)
Summary:
Add musttail to any resume instruction that is immediately followed by a
suspend (i.e. ret). We do this even in -O0 to support guaranteed tail call
for symmetrical coroutine control transfer (C++ Coroutines TS extension).
This transformation is done only in the resume part of the coroutine that has
identical signature and calling convention as the coro.resume call.

Reviewers: GorNishanov

Reviewed By: GorNishanov

Subscribers: EricWF, majnemer, llvm-commits

Differential Revision: https://reviews.llvm.org/D37125

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311751 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-25 02:25:10 +00:00
Craig Topper
01fd7ebe93 [X86] Add TBM instructions to X86InstrInfo::isDefConvertible.
This allows us to remove "test" instructions and use the flags from the TBM instructions directly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311747 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-25 01:59:06 +00:00
Chandler Carruth
31f4977889 [x86] Back out one aspect of r311318: don't generically set
FeatureSlowUAMem32.

The idea was to mark things that are slow on widely available processors
as slow in the generic CPU so that the code generated for that CPU would
be fast across those processors. However, for this feature that doesn't
work out very well at all.

The problem here is that you can very easily enable AVX or AVX2 on top
of this generic CPU. For example, this can happen just by using AVX2
intrinsics from Clang within a region of code guarded by a dynamic CPU
feature test. When you do that, the generated code with SlowUAMem32 set
is ... amazingly slower. The problem is that there really aren't very
good alternatives to the unaligned loads, and so our vector codegen
regresses significantly.

The other issue is that there are plenty of AMD CPUs with AVX1 that
don't set FeatureSlowUAMem32 and so we shouldn't just check for AVX2
instead of this special feature. =/

It would be nice to have the target attribute logic be able to
enable/disable more than just one feature at a time and control this in
a more fine grained and useful way, but that doesn't seem easy. Given
that it is only Sandybridge and Ivybridge that set this feature, for now
I'm just backing it out of the generic CPU. That has the additional
advantage of going back to the previous state that people seemed vaguely
happy with.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311740 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-25 00:56:05 +00:00
Chandler Carruth
66f94c748f [x86] Fix an amazing goof in the handling of sub, or, and xor lowering.
The comment for this code indicated that it should work similar to our
handling of add lowering above: if we see uses of an instruction other
than flag usage and store usage, it tries to avoid the specialized
X86ISD::* nodes that are designed for flag+op modeling and emits an
explicit test.

Problem is, only the add case actually did this. In all the other cases,
the logic was incomplete and inverted. Any time the value was used by
a store, we bailed on the specialized X86ISD node. All of this appears
to have been historical where we had different logic here. =/

Turns out, we have quite a few patterns designed around these nodes. We
should actually form them. I fixed the code to match what we do for add,
and it has quite a positive effect just within some of our test cases.
The only thing close to a regression I see is using:

  notl %r
  testl %r, %r

instead of:

  xorl -1, %r

But we can add a pattern or something to fold that back out. The
improvements seem more than worth this.

I've also worked with Craig to update the comments to no longer be
actively contradicted by the code. =[ Some of this still remains
a mystery to both Craig and myself, but this seems like a large step in
the direction of consistency and slightly more accurate comments.

Many thanks to Craig for help figuring out this nasty stuff.

Differential Revision: https://reviews.llvm.org/D37096

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311737 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-25 00:34:07 +00:00
Sanjay Patel
c8f9cf9e26 [DAG] convert vector select-of-constants to logic/math
This goes back to a discussion about IR canonicalization. We'd like to preserve and convert
more IR to 'select' than we currently do because that's likely the best choice in IR:
http://lists.llvm.org/pipermail/llvm-dev/2016-September/105335.html
...but that's often not true for codegen, so we need to account for this pattern coming in
to the backend and transform it to better DAG ops.

Steps in this patch:

  1. Add an EVT param to the existing convertSelectOfConstantsToMath() TLI hook to more finely
     enable this transform. Other targets will probably want that anyway to distinguish scalars
     from vectors. We're using that here to exclude AVX512 targets, but it may not be necessary.

  2. Convert a vselect to ext+add. This eliminates a constant load/materialization, and the
     vector ext is often free.
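
A scalar-per-lane sketch of step 2, assuming the fold has the shape "select Cond, C+1, C -> add (zext Cond), C" (that exact shape is an assumption here, not quoted from the patch). The four-lane types and function names are hypothetical, and unsigned lanes are used so that wraparound is well defined.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<std::uint32_t, 4>;
using Mask4 = std::array<bool, 4>;

// Reference form: each lane selects between the constants C+1 and C.
Vec4 selectOfConstants(const Mask4 &cond, const Vec4 &c) {
  Vec4 out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = cond[i] ? c[i] + 1u : c[i];
  return out;
}

// Rewritten form: zero-extend each condition bit to 0 or 1 and add it to C.
// No second constant vector has to be loaded or materialized.
Vec4 extPlusAdd(const Mask4 &cond, const Vec4 &c) {
  Vec4 out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = static_cast<std::uint32_t>(cond[i]) + c[i];
  return out;
}
```

Both functions compute the same lanes for every input, which is the property the transform relies on.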

Implementing a more general fold using xor+and can be a follow-up for targets that don't have
a legal vselect. It's also possible that we can remove the TLI hook for the special case fold
implemented here because we're eliminating a constant, but it needs to be tested on other
targets.

Differential Revision: https://reviews.llvm.org/D36840



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311731 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 23:24:43 +00:00
Mandeep Singh Grang
89c6743f22 [ADT] Enable reverse iteration for DenseMap
Reviewers: mehdi_amini, dexonsmith, dblaikie, davide, chandlerc, davidxl, echristo, efriedma

Reviewed By: dblaikie

Subscribers: rsmith, mgorny, emaste, llvm-commits

Differential Revision: https://reviews.llvm.org/D35043

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311730 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 23:02:48 +00:00
Xinliang David Li
092c93330a [Profile] backward propagate profile info in JumpThreading
Take-2 after fixing bugs in the original patch.

Differential Revision: http://reviews.llvm.org/D36864


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311727 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 22:54:01 +00:00
Sanjay Patel
3ff3073768 [InstCombine] fix and enhance udiv/urem narrowing
There are 3 small independent changes here:

  1. Account for multiple uses in the pattern matching: avoid the transform if it increases the instruction count.
  2. Add a missing fold for the case where the numerator is the constant: http://rise4fun.com/Alive/E2p
  3. Enable all folds for vector types.
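
A minimal sketch of why narrowing is sound, assuming the fold has the shape "udiv (zext X), (zext Y) -> zext (udiv X, Y)", and likewise for urem (the exact pattern is an assumption, not quoted from the patch). The helper names are hypothetical, and 8-bit to 32-bit is just one choice of widths. The divisor must be nonzero.

```cpp
#include <cstdint>

// Wide form: zero-extend both 8-bit operands to 32 bits, then divide.
std::uint32_t udivWide(std::uint8_t x, std::uint8_t y) {
  return std::uint32_t(x) / std::uint32_t(y);
}

// Narrow form: divide in 8 bits, then zero-extend the quotient.
std::uint32_t udivNarrow(std::uint8_t x, std::uint8_t y) {
  std::uint8_t q = static_cast<std::uint8_t>(x / y);
  return std::uint32_t(q);
}

// Same pair for the remainder.
std::uint32_t uremWide(std::uint8_t x, std::uint8_t y) {
  return std::uint32_t(x) % std::uint32_t(y);
}

std::uint32_t uremNarrow(std::uint8_t x, std::uint8_t y) {
  std::uint8_t r = static_cast<std::uint8_t>(x % y);
  return std::uint32_t(r);
}
```

Because the quotient and remainder of two zero-extended values never exceed the larger operand, the result always fits in the narrow type, so the two forms agree for every nonzero divisor.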

There's still one more potential change - use "shouldChangeType()" to keep from transforming to an illegal integer type.

Differential Revision: https://reviews.llvm.org/D36988


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311726 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 22:54:01 +00:00
Dehao Chen
d38687abb5 Move accurate-sample-profile into the function attribute.
Summary: We need to have accurate-sample-profile in function attribute so that it works with LTO.

Reviewers: davidxl, rsmith

Reviewed By: davidxl

Subscribers: sanjoy, mehdi_amini, javed.absar, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D37113

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311706 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 21:37:04 +00:00
Jacob Gravelle
73e192592e [WebAssembly] FastISel : Bail to SelectionDAG for constexpr calls
Summary: Currently FastISel lowers constexpr calls as indirect calls.
We'd like those to be direct calls, and falling back to SelectionDAGISel
handles that.

Reviewers: dschuff, sunfish

Subscribers: jfb, sbc100, llvm-commits, aheejin

Differential Revision: https://reviews.llvm.org/D37073

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311693 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 19:53:44 +00:00
Daniel Sanders
a8273d0212 [globalisel][tablegen] Predicates should start from GIPFP_Invalid+1 not GIPFP_Invalid
This fixes a warning when there are zero defined predicates and also fixes an
unnoticed bug where the first predicate in the table was unusable.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311684 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 18:54:16 +00:00
Pete Couperus
53355ce574 [ARC] Add ARC backend.
Add the ARC backend as an experimental target to lib/Target.
Reviewed at: https://reviews.llvm.org/D36331



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311667 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 15:40:33 +00:00
Sjoerd Meijer
bfcfe1e763 [AArch64] Add FMOVH0: materialize 0 using zero register for f16 values
Instead of loading 0 from a constant pool, it's of course much better to
materialize it using an fmov and the zero register.

Thanks to Ahmed Bougacha for the suggestion.

Differential Revision: https://reviews.llvm.org/D37102


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311662 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 14:47:06 +00:00
Chad Rosier
b142bc0a90 [TargetParser][AArch64] Add support for RDM feature in the target parser.
Differential Revision: https://reviews.llvm.org/D37081

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311659 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 14:30:44 +00:00
Michael Zuckerman
9db416111e Adding base lit test for x86interleaved
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311658 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 14:11:28 +00:00
Krzysztof Parzyszek
3dfcd099f5 [Hexagon] Generate correct runtime check when recognizing memmove
The check (assuming positive stride) for validity of memmove should be
(a) the destination is at a lower address than the source, or
(b) the distance between the source and destination is greater than or
    equal to the number of bytes copied.

For the second part it is sufficient to assume that the destination
is at a higher address, since the opposite case is covered by (a).
The distance calculation was previously done by subtracting the
pointers in the wrong order.
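
The condition above can be sketched in Python as follows. This is a model over a bytearray, not the Hexagon code; the helper names are hypothetical. It treats a forward byte-by-byte loop as the code being recognized and checks when it agrees with memmove semantics.

```python
def forward_copy_is_safe(dst, src, n):
    """Conditions (a) and (b) from the message, for a positive stride.

    (a) the destination is below the source, or
    (b) the distance dst - src is at least the number of bytes copied.
    Note the order of the subtraction: it is destination minus source.
    """
    return dst < src or (dst - src) >= n


def forward_copy(buf, dst, src, n):
    """Byte-by-byte copy in increasing address order (the loop being recognized)."""
    for i in range(n):
        buf[dst + i] = buf[src + i]


def memmove(buf, dst, src, n):
    """Reference semantics: copy through a temporary so overlap never matters."""
    buf[dst:dst + n] = bytes(buf[src:src + n])
```

For example, with dst = src + 1 and n = 4 the check fails, and the forward loop would smear the first byte over the destination, whereas memmove would not.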


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311650 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 11:59:53 +00:00
Evgeny Astigeevich
6e59618ef9 [ARM, Thumb1] Prevent ARMTargetLowering::isLegalAddressingMode from accepting illegal modes
ARMTargetLowering::isLegalAddressingMode can accept illegal addressing modes
for the Thumb1 target. This causes generation of redundant code and affects
performance.

This fixes PR34106: https://bugs.llvm.org/show_bug.cgi?id=34106

Differential Revision: https://reviews.llvm.org/D36467


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311649 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 10:00:25 +00:00
Sjoerd Meijer
9a6d31e0ca [AArch64] Custom lowering of copysign f16
This is a follow up patch of r311154 and introduces custom lowering of copysign
f16 to avoid promotions to single precision types when the subtarget supports
fullfp16.

Differential Revision: https://reviews.llvm.org/D36893


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311646 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 09:21:10 +00:00
Daniel Sanders
02ad65f1a0 Re-commit: [globalisel][tablegen] Add support for ImmLeaf without SDNodeXForm
Summary:
This patch adds support for predicates on imm nodes but only for ImmLeaf and not
for PatLeaf or PatFrag and only where the value does not need to be transformed
before being rendered into the instruction.

The limitation on PatLeaf/PatFrag/SDNodeXForm is due to differences in the
necessary target-supplied C++ for GlobalISel.

Depends on D36085

The previous commit was reverted for breaking the build but this appears to have
been the recurring problem on the Windows bots with tablegen not being re-run
when llvm-tblgen is changed but the .td's aren't. If it re-occurs then forcing a
build with clean=True should fix it but this string should do this in advance:
    Requires a clean build.

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Reviewed By: rovka

Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D36086



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311645 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 09:11:20 +00:00
Coby Tayree
67d905be8e [LLVM][x86][Inline Asm] support for GCC style inline asm - Y<x> constraints
This patch is intended to enable the use of basic double letter constraints used in GCC extended inline asm {Yi Y2 Yz Y0 Ym Yt}.
Supersedes D35204
Clang counterpart: D36371

Differential Revision: https://reviews.llvm.org/D36369


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311644 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 09:08:33 +00:00
Mikael Holmen
6dbfbe1563 [Reassociate] Do not drop debug location if replacement is missing
Summary:
When reassociating an expression, do not drop the instruction's
original debug location in case the replacement location is
missing.

The debug location must at least not be dropped for inlinable
callsites of debug-info-bearing functions in debug-info-bearing
functions. Failing to do so would result in an "inlinable function "
"call in a function with debug info must have a !dbg location"
error in the verifier.

As preserving the original debug location is not expected
to result in overly jumpy debug line information, it is
preserved for all other cases too.

This fixes PR34231:
https://bugs.llvm.org/show_bug.cgi?id=34231

Original patch by David Stenberg

Reviewers: davide, craig.topper, mcrosier, dblaikie, aprantl

Reviewed By: davide, aprantl

Subscribers: aprantl

Differential Revision: https://reviews.llvm.org/D36865

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311642 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 09:05:00 +00:00
Coby Tayree
89feab7412 [X86AsmParser] Refactoring, (almost) NFC.
Some refactoring to X86AsmParser, mostly regarding the way rewrites are conducted.
Mainly, we try to concentrate all the rewrite effort under one hood, so it'll hopefully be less of a mess and easier to maintain and understand.
Naturally, some frontend tests were affected: D36794

Differential Revision: https://reviews.llvm.org/D36793


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311639 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 08:46:25 +00:00
Matt Arsenault
1a6aed20fa IPRA: Don't assume called function is first call operand
Fixes not finding the called global for AMDGPU
call pseudoinstructions, which prevented IPRA
from doing much.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311637 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 07:55:15 +00:00
Chandler Carruth
ea64a39482 [x86] NFC: Clean up two tests and generate precise checks for them.
Mostly this involved giving unnamed values names and running the IR
through `opt` to re-format it but merging in any important comments in
the original. I then deleted pointless comments and inlined the function
attributes for ease of reading and editing.

All of this is to make it much easier to see the instructions being
generated here and evaluate any updates to the tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311634 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 07:38:36 +00:00
Igor Breger
5b46835851 [GlobalISel][X86] Support G_IMPLICIT_DEF.
Summary: Support G_IMPLICIT_DEF.

Reviewers: zvi, guyblank, t.p.northover

Reviewed By: guyblank

Subscribers: rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D36733

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311633 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 07:06:27 +00:00
Wei Ding
75acc65cb3 Add 'llvm.experimental.constrained.fma' Intrinsic.
Differential Revision: http://reviews.llvm.org/D36335

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311629 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 04:18:24 +00:00
Daniel Berlin
9dc6eddb1d NewGVN: We weren't properly simplifying selects with equal arguments due to a thinko.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311626 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 02:43:17 +00:00
Hans Wennborg
6d2214fde6 [DAG] Fix Node Replacement in PromoteIntBinOp
When one operand is a user of another in a promoted binary operation
we may replace and delete the returned value before returning
triggering an assertion. Reorder node replacements to prevent this.

Fixes PR34137.

Landing on behalf of Nirav.

Differential Revision: https://reviews.llvm.org/D36581

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311623 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 01:08:27 +00:00
Dylan McKay
e42da35bfc [AVR] Use the correct register classes for 16-bit atomic operations
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311620 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 00:14:38 +00:00
Dehao Chen
4d101c0667 Add test to cover accurate-sample-profile.
Summary: This patch adds test to cover the logic guarded by "accurate-sample-profile" flag.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: sanjoy, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D37084

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311618 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 23:19:11 +00:00
Tim Northover
b97dac5226 ARM: use internal relocations for local symbols after all.
Switching to external relocations for ARM-mode branches (to allow Thumb
interworking when the offset is unencodable) causes calls to temporary symbols
to be miscompiled and instead go to the parent externally visible symbol.

Calling a temporary never happens in compiled code, but can occasionally happen in
hand-written assembly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311611 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 22:07:10 +00:00
Aditya Nandakumar
8bb11e0e80 Fix Verifier test - add REQUIRES aarch64-registered-target
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311609 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 21:55:36 +00:00
Adrian Prantl
a2224c91e5 Add a Verifier check for DILocation's scopes.
Found via https://bugs.llvm.org/show_bug.cgi?id=33997.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311608 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 21:52:24 +00:00
Jonas Devlieghere
e69aa182b7 [WebAssembly] Fix overflow for input with missing version
Differential revision: https://reviews.llvm.org/D37070

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311605 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 21:36:04 +00:00
Rong Xu
7996242b16 [PGO] Set edge weights for indirectbr instruction with profile counts
Current PGO only annotates the edge weight for branch and switch instructions
with profile counts. We should also annotate the indirectbr instruction as
all the information is there. This patch enables the annotating for indirectbr
instructions. Also uses this annotation in branch probability analysis.

Differential Revision: https://reviews.llvm.org/D37074


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311604 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 21:36:02 +00:00
Aditya Nandakumar
5f06407357 [GISEl]: Translate phi into G_PHI
G_PHI has the same semantics as PHI but also has types.
This lets us verify that the types in the G_PHI are consistent.
This also allows specifying legalization actions for G_PHIs.

https://reviews.llvm.org/D36990

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311596 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 20:45:48 +00:00
Reid Kleckner
a5b2af0eae Parse and print DIExpressions inline to ease IR and MIR testing
Summary:
Most DIExpressions are empty or very simple. When they are complex, they
tend to be unique, so checking them inline is reasonable.

This also avoids the need for CodeGen passes to append to the
llvm.dbg.mir named md node.

See also PR22780, for making DIExpression not be an MDNode.

Reviewers: aprantl, dexonsmith, dblaikie

Subscribers: qcolombet, javed.absar, eraman, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D37075

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311594 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 20:31:27 +00:00
Lei Huang
35d604386e Update branch coalescing to be a PowerPC specific pass
Implementing this pass as a PowerPC specific pass.  Branch coalescing utilizes
the analyzeBranch method which currently does not include any implicit operands.
This is not an issue on PPC but must be handled on other targets.

Differential Revision: https://reviews.llvm.org/D32776

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311588 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 19:25:04 +00:00
Craig Topper
44f90d0a15 [AVX512] Don't create SHRUNKBLEND SDNodes for 512-bit vectors
There are no 512-bit blend instructions so we shouldn't create SHRUNKBLEND for them.

On a side note, it looks like there may be a missed opportunity for constant folding TESTM when LHS and RHS are equal.

This fixes PR34139.

Differential Revision: https://reviews.llvm.org/D36992

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311572 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 16:41:02 +00:00
Hans Wennborg
18b8cfa7ee LowerAtomic: Don't skip optnone functions; atomics still need lowering (PR34020)
The lowering isn't really an optimization, so optnone shouldn't make a
difference. ARM relies on the pass running when using "-mthread-model
single", because in that mode, it doesn't run AtomicExpand. See bug for
more details.

Differential Revision: https://reviews.llvm.org/D37040

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311565 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 15:43:28 +00:00
Victor Leschuk
0e75cf0e23 Revert r311546 as it breaks build
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/4394



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311560 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 15:21:10 +00:00
Gor Nishanov
1e95aaa810 [coroutines] CoroBegin from inner coroutines should be considered for spills
Summary:
If a coroutine outer calls another coroutine inner and the inner coroutine body is inlined into the outer, coro.begin from the inner coroutine should be considered for spilling if accessed across suspends.

Prior to this change, coroutine frame building code was not considering any coro.begins for spilling.
With this change, we only ignore coro.begin for the current coroutine, but, any coro.begins that were inlined into the current coroutine are eligible for spills.

Fixes PR34267

Reviewers: GorNishanov

Subscribers: qcolombet, llvm-commits, EricWF

Differential Revision: https://reviews.llvm.org/D37062

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311556 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 14:47:52 +00:00
Chad Rosier
37d17304a3 [Reassociate] Don't canonicalize x + (-Constant * y) -> x - (Constant * y)..
..if the resulting subtract will be broken up later.  This can cause us to get
into an infinite loop.

x + (-5.0 * y)      -> x - (5.0 * y)       ; Canonicalize neg const
x - (5.0 * y)       -> x + (0 - (5.0 * y)) ; Break up subtract
x + (0 - (5.0 * y)) -> x + (-5.0 * y)      ; Replace 0-X with X*-1.
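
The cycle can be reproduced with a toy rewriter. The Python sketch below is an illustration only: expressions are plain strings, the rule table is copied from the three lines above, and the function name is hypothetical. It is not how Reassociate represents or rewrites IR.

```python
# The three rewrites from the message, as a string-to-string rule table.
REWRITES = {
    "x + (-5.0 * y)": "x - (5.0 * y)",        # canonicalize negative constant
    "x - (5.0 * y)": "x + (0 - (5.0 * y))",   # break up subtract
    "x + (0 - (5.0 * y))": "x + (-5.0 * y)",  # replace 0-X with X*-1
}


def rewrite_trace(expr, rules, max_steps=10):
    """Apply rules until none matches or a previous form reappears.

    Returns (trace, cycled): the sequence of forms visited and whether
    the rewriting came back to a form it had already produced.
    """
    trace = [expr]
    for _ in range(max_steps):
        if expr not in rules:
            return trace, False
        expr = rules[expr]
        if expr in trace:
            return trace + [expr], True
        trace.append(expr)
    return trace, False
```

Dropping the first rule from the table, which corresponds to skipping the canonicalization when the subtract would be broken up again, leaves the starting expression as a fixed point.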

PR34078

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311554 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 14:10:06 +00:00
Daniel Sanders
631020137d [globalisel][tablegen] Add support for ImmLeaf without SDNodeXForm
Summary:
This patch adds support for predicates on imm nodes but only for ImmLeaf and not for PatLeaf or PatFrag and only where the value does not need to be transformed before being rendered into the instruction.

The limitation on PatLeaf/PatFrag/SDNodeXForm is due to differences in the necessary target-supplied C++ for GlobalISel.

Depends on D36085

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Reviewed By: rovka

Subscribers: kristof.beyls, javed.absar, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D36086

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311546 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 12:14:18 +00:00
Florian Hahn
6c41152656 [ARM] Check for assembler instructions in test.
Currently this test causes test failures on some machines, due to isel not being registered. Update the test to run all passes and check emitted assembly instructions for now. 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311545 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 11:53:24 +00:00
Florian Hahn
2d810c27ff [ARM] Add missing patterns for insert_subvector.
Summary: In some cases, a shufflevector instruction can be transformed into a form involving insert_subvector instructions. The ARM backend was missing some insert_subvector patterns, causing a failure during instruction selection. AArch64 has similar patterns.

Reviewers: t.p.northover, olista01, javed.absar, rengolin

Reviewed By: javed.absar

Subscribers: aemerson, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D36796

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311543 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 10:20:59 +00:00
Daniel Sanders
c2086906b5 [globalisel][tablegen] Add tests for FeatureBitsets and ComplexPattern predicates.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311542 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 10:09:25 +00:00
Davide Italiano
d7a2c86855 [gold] Test we don't strip globals when producing relocatables.
lld was broken in this regard (PR33097). The gold plugin gets this
right, so no changes are needed, but it is better to add a test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311541 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 09:43:41 +00:00
Davide Italiano
b7a48d833a [InstCombine] Fold branches with irrelevant conditions to a constant.
InstCombine folds instructions with irrelevant conditions to undef.
This, as Nuno confirmed, is a bug.
(see https://bugs.llvm.org/show_bug.cgi?id=33409#c1 )

Given that the original motivation for the change was to remove a
USE, we now fold to false instead (which reaches the same goal
without undesired side effects).

Fixes PR33409.

Differential Revision:  https://reviews.llvm.org/D36975

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311540 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 09:14:37 +00:00
Hiroshi Inoue
ea638e645f [PowerPC] better instruction selection for OR (XOR) with a 32-bit immediate
- recommitting after fixing a test failure on MacOS

On PPC64, OR (XOR) with a 32-bit immediate can be done with only two instructions, i.e. ori + oris.
But the current LLVM generates three or four instructions for this purpose (and also it clobbers one GPR).

This patch makes PPC backend generate ori + oris (xori + xoris) for OR (XOR) with a 32-bit immediate.

e.g. (x | 0xFFFFFFFF) should be

	ori 3, 3, 65535
	oris 3, 3, 65535

but LLVM generates without this patch

	li 4, 0
	oris 4, 4, 65535
	ori 4, 4, 65535
	or 3, 3, 4
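
A small Python model of the two-instruction sequence may help. It assumes the usual semantics of these instructions (ori ORs in a zero-extended 16-bit immediate, oris ORs in the immediate shifted left by 16 bits); the function names simply mirror the mnemonics and are not part of the patch.

```python
MASK64 = (1 << 64) - 1  # model a 64-bit GPR


def ori(reg, imm16):
    """OR the register with a zero-extended 16-bit immediate."""
    assert 0 <= imm16 <= 0xFFFF
    return (reg | imm16) & MASK64


def oris(reg, imm16):
    """OR the register with a 16-bit immediate shifted left by 16 bits."""
    assert 0 <= imm16 <= 0xFFFF
    return (reg | (imm16 << 16)) & MASK64


def or_imm32(reg, imm32):
    """x | imm32 as ori (low half) followed by oris (high half); no scratch GPR needed."""
    assert 0 <= imm32 <= 0xFFFFFFFF
    return oris(ori(reg, imm32 & 0xFFFF), imm32 >> 16)
```

The same split applies to XOR with xori and xoris, since each instruction touches a disjoint 16-bit field of the result.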

Differential Revision: https://reviews.llvm.org/D34757



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311538 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 08:55:18 +00:00
Hiroshi Inoue
0722ecf05c Revert rL311526: [PowerPC] better instruction selection for OR (XOR) with a 32-bit immediate
This reverts commit rL311526 due to failures on some buildbots.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311530 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 06:38:05 +00:00
Hiroshi Inoue
65bc8755b1 [PowerPC] better instruction selection for OR (XOR) with a 32-bit immediate
On PPC64, OR (XOR) with a 32-bit immediate can be done with only two instructions, i.e. ori + oris.
But the current LLVM generates three or four instructions for this purpose (and also it clobbers one GPR).

This patch makes PPC backend generate ori + oris (xori + xoris) for OR (XOR) with a 32-bit immediate.

e.g. (x | 0xFFFFFFFF) should be

	ori 3, 3, 65535
	oris 3, 3, 65535

but LLVM generates without this patch

	li 4, 0
	oris 4, 4, 65535
	ori 4, 4, 65535
	or 3, 3, 4

Differential Revision: https://reviews.llvm.org/D34757



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311526 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 05:15:15 +00:00
Dean Michael Berris
d5e52ea44d [XRay][CodeGen] Use PIC-friendly code in XRay sleds; remove synthetic references in .text
Summary:
This change achieves two things:

  - Redefine the Custom Event handling instrumentation points emitted by
    the compiler to not require dynamic relocation of references to the
    __xray_CustomEvent trampoline.

  - Remove the synthetic reference we emit at the end of a function that
    we used to keep auxiliary sections alive in favour of SHF_LINK_ORDER
    associated with the section where the function is defined.

To achieve the custom event handling change, we've had to introduce the
concept of sled versioning -- this will need to be supported by the
runtime to allow us to understand how to turn on/off the new version of
the custom event handling sleds. That change has to land first before we
change the way we write the sleds.

To remove the synthetic reference, we rely on a relatively new linker
feature that preserves the sections that are associated with each other.
This allows us to limit the effects on the .text section of ELF
binaries.

Because we're still using absolute references that are resolved at
runtime for the instrumentation map (and function index) maps, we mark
these sections write-able. In the future we can re-define the entries in
the map to use relative relocations instead that can be statically
determined by the linker. That change will be a bit more invasive so we
defer this for later.

Depends on D36816.

Reviewers: dblaikie, echristo, pcc

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36615

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311525 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 04:49:41 +00:00
Yonghong Song
d7276a40d8 bpf: add variants of -mcpu=# and support for additional jmp insns
-mcpu=# will support:
  . generic: the default insn set
  . v1: insn set version 1, the same as generic
  . v2: insn set version 2, version 1 + additional jmp insns
  . probe: the compiler will probe the underlying kernel to
           decide proper version of insn set.

We did not use -mcpu=native since llc/llvm will interpret -mcpu=native
as the underlying hardware architecture regardless of the -march value.

Currently, only x86_64 supports -mcpu=probe. Other architectures will
silently revert to "generic".

Also added -mcpu=help to print available cpu parameters.
llvm will print out the information only if there is at least one
cpu and at least one feature. Add an unused dummy feature to
enable the printout.

Examples for usage:
$ llc -march=bpf -mcpu=v1 -filetype=asm t.ll
$ llc -march=bpf -mcpu=v2 -filetype=asm t.ll
$ llc -march=bpf -mcpu=generic -filetype=asm t.ll
$ llc -march=bpf -mcpu=probe -filetype=asm t.ll
$ llc -march=bpf -mcpu=v3 -filetype=asm t.ll
'v3' is not a recognized processor for this target (ignoring processor)
...
$ llc -march=bpf -mcpu=help -filetype=asm t.ll
Available CPUs for this target:

  generic - Select the generic processor.
  probe   - Select the probe processor.
  v1      - Select the v1 processor.
  v2      - Select the v2 processor.

Available features for this target:

  dummy - unused feature.

Use +feature to enable a feature, or -feature to disable it.
For example, llc -mcpu=mycpu -mattr=+feature1,-feature2
...

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311522 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 04:25:57 +00:00
Matthias Braun
212ebf2492 Fix tail-merge-after-mbp test
The output of this test changed after the fix in r311520 to have
-run-pass=block-placement behave like it does in a normal pipeline.
Adjust the test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311521 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 03:49:53 +00:00
Matthias Braun
d44f02488c Add test case for r311511
This also changes the TailDuplicator to be configured explicitly
pre/post regalloc rather than relying on the isSSA() flag. This was
necessary to have `llc -run-pass` work reliably.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311520 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 03:17:59 +00:00
Craig Topper
1bc52fbec6 [InstCombine] Remove check for sext of vector icmp from shouldOptimizeCast
Looks like for 'and' and 'or' we end up performing at least some of the transformations this is blocking in a roundabout way anyway.

For 'and sext(cmp1), sext(cmp2)' we end up later turning it into 'select cmp1, sext(cmp2), 0'. Then we optimize that back to sext (and cmp1, cmp2). This is the same result we would have gotten if shouldOptimizeCast hadn't blocked it. We do something analogous for 'or'.

With this patch we allow that transformation to happen directly in foldCastedBitwiseLogic. And we now support the same thing for 'xor'. This is definitely opening up many other cases, but since we already went around it for some cases hopefully it's ok.
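
The equivalences relied on here can be checked on scalars with a short Python sketch. It models a single i1 compare result sign-extended to an assumed 32-bit type; the helper names are made up for the example, and a vector would behave the same way per lane.

```python
import operator

BITS = 32  # assumed destination width; any width behaves the same way
ALL_ONES = (1 << BITS) - 1


def sext_i1(bit):
    """Sign-extend an i1: 0 stays 0, 1 becomes all ones."""
    return ALL_ONES if bit else 0


def select(cond, true_value, false_value):
    return true_value if cond else false_value


def logic_of_sext_equals_sext_of_logic(op):
    """Check op(sext a, sext b) == sext(op(a, b)) for every pair of i1 values."""
    return all(
        op(sext_i1(a), sext_i1(b)) == sext_i1(op(a, b))
        for a in (0, 1)
        for b in (0, 1)
    )


def select_form_matches_and():
    """Check select cmp1, sext(cmp2), 0 == sext(and cmp1, cmp2)."""
    return all(
        select(a, sext_i1(b), 0) == sext_i1(a & b)
        for a in (0, 1)
        for b in (0, 1)
    )
```

The check passes for 'and', 'or' and 'xor' alike, which is consistent with the message extending the fold to 'xor'.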

Differential Revision: https://reviews.llvm.org/D36213

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311508 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-22 23:40:15 +00:00
Jonas Devlieghere
1eae26afe9 Revert "[llvm-dwarfdump] Print type names in DW_AT_type DIEs"
This reverts commit r311492.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311499 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-22 21:59:46 +00:00
Jonas Devlieghere
30abbf9835 [llvm-dwarfdump] Print type names in DW_AT_type DIEs
This patch adds printing for DW_AT_type DIEs like it's currently already
the case for DW_AT_specification DIEs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311492 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-22 21:41:49 +00:00
Peter Collingbourne
bb516bcc22 WholeProgramDevirt: Create bitcast to i8* at each virtual call site.
We can't reuse the llvm.assume instruction's bitcast because it may not
dominate every user of the vtable pointer.

Differential Revision: https://reviews.llvm.org/D36994

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311491 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-22 21:41:19 +00:00
Matt Morehouse
6008ca211c [SanitizerCoverage] Optimize stack-depth instrumentation.
Summary:
Use the initialexec TLS type and eliminate calls to the TLS
wrapper.  Fixes the sanitizer-x86_64-linux-fuzzer bot failure.

Reviewers: vitalybuka, kcc

Reviewed By: kcc

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D37026

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311490 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-22 21:28:29 +00:00
Jakub Kuderski
5288e9123b [ADCE][Dominators] Reapply: Teach ADCE to preserve dominators
Summary:
This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees.

This reapplies the original patch r311057 that was reverted in r311381.
The previous version wasn't using the batch update api for updating dominators,
which in very rare cases caused assertion failures.

This also fixes PR34258.

Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki

Reviewed By: davide

Subscribers: grandinj, zhendongsu, llvm-commits, david2050

Differential Revision: https://reviews.llvm.org/D35869

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311467 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-22 16:30:21 +00:00
Sanjay Patel
40f6dc61c6 [x86] auto-generate full checks; NFC
I don't see anything Darwin-specific here, so I made the target generic x86-64.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311465 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-22 16:27:00 +00:00
Sanjay Patel
c1aac4e600 [x86] simplify runs and auto-generate full checks
I've replaced the two OS-specific runs with a generic run because
there's no functional difference in the resulting output that
we're checking. Also, the script still doesn't work with a Win
target.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311463 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-22 16:21:45 +00:00
Sam Parker
abb321130e [ARM][AArch64] v8.3-A Javascript Conversion
Armv8.3-A adds instructions that convert a double-precision floating
point number to a signed 32-bit integer with round towards zero,
designed for improving Javascript performance.
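
For readers unfamiliar with the conversion in question, the Python sketch below models ECMAScript's ToInt32 operation, which is assumed here to be the behaviour the new instructions target. The function name is hypothetical, and the wrap-around for out-of-range inputs comes from the ECMAScript definition, not from this commit message.

```python
import math


def js_to_int32(x):
    """Model of ECMAScript ToInt32 for a double-precision input."""
    if math.isnan(x) or math.isinf(x):
        return 0
    n = math.trunc(x)  # round toward zero
    n &= 0xFFFFFFFF    # reduce modulo 2**32
    # Reinterpret the low 32 bits as a signed value.
    return n - (1 << 32) if n >= (1 << 31) else n
```

Without a dedicated instruction, the wrap-around for out-of-range values has to be handled with extra code around an ordinary float-to-int conversion.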

Differential Revision: https://reviews.llvm.org/D36785


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311448 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-22 11:08:21 +00:00
Renato Golin
f09cdf90c0 [ARM] Avoid creating duplicate ANDs in SelectionDAG
When expanding a BRCOND into a BR_CC, do not create an AND 1
if one already exists.

Review: D36705

Patch by Joel Galenson <jgalenson@google.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311447 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-22 11:02:45 +00:00
Renato Golin
0dfee36a63 [ARM] Call setBooleanContents(ZeroOrOneBooleanContent)
The ARM backend should call setBooleanContents so that it can
use known bits to make some optimizations.

Review: D35821

Patch by Joel Galenson <jgalenson@google.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311446 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-22 11:02:37 +00:00
George Rimar
95a4133b77 [lib/Analysis] - Mark personality functions as live.
This is PR33245.

The case I am fixing is the following:
Imagine we have 2 BC files: one defines and uses a personality routine,
the second has only a declaration and also uses it.

Previously, the algorithm computing dead symbols (llvm::computeDeadSymbols) did
not know about personality routines and left them dead even if the function that
has the routine was live.

As a result, the thinLTOInternalizeAndPromoteGUID() method changed the binding for
such a symbol to local. Later, when LLD tried to link these objects, it failed
because one object had an undefined global symbol for the routine and the second
object contained a local definition instead of a global one.

This patch sets the live root flag on the corresponding FunctionSummary
for personality routines when we build the per-module summaries
during the compile step.

Differential revision: https://reviews.llvm.org/D36834

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311432 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-22 08:50:56 +00:00
Craig Topper
03d8600380 [X86] Prevent several calls to ISD::isConstantSplatVector from returning a narrower APInt than the original scalar type
ISD::isConstantSplatVector can shrink to the smallest splat width. But we don't check the size of the resulting APInt at all. This can cause us to misinterpret the results.

This patch just adds a flag to prevent the APInt from changing width.
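
To make the failure mode concrete, here is a Python sketch of splat-width shrinking. It is an illustration, not the ISD::isConstantSplatVector implementation; the function name, the allow_shrink parameter and the 8-bit minimum are assumptions for the example.

```python
def constant_splat(value, bits, allow_shrink=True, min_bits=8):
    """Return (splat_value, splat_bits) for a constant element of width `bits`.

    With allow_shrink=True the element is halved for as long as both halves
    are equal, so the result may be narrower than the original element type.
    With allow_shrink=False the original width is always kept.
    """
    if not allow_shrink:
        return value, bits
    while bits > min_bits:
        half = bits // 2
        mask = (1 << half) - 1
        low, high = value & mask, (value >> half) & mask
        if low != high:
            break
        value, bits = low, half
    return value, bits
```

A caller that ignores the returned width would read the 32-bit element 0x01010101 as the value 1, which is the kind of misinterpretation described above.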

Fixes PR34271.

Differential Revision: https://reviews.llvm.org/D36996

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311429 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-22 05:40:17 +00:00
Adrian Prantl
88b828e253 dsymutil: don't copy compile units without children from PCM files
rdar://problem/33830532

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311416 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-22 01:10:48 +00:00
Sanjay Patel
6995f18af0 [InstCombine] add udiv/urem tests with constant numerator; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311396 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 22:40:02 +00:00
Sanjay Patel
15e40f1526 [InstCombine] add more tests for udiv/urem narrowing; NFC
We don't currently limit these folds with hasOneUse() or shouldChangeType().


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311390 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 21:57:52 +00:00
Evandro Menezes
a978c65d1d [AArch64] Restore the test of conditional branch fusion
Restore the functionality of this test that was broken by
https://reviews.llvm.org/rL306144.

Differential revision: https://reviews.llvm.org/D36807

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311389 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 21:57:43 +00:00
Tim Northover
55e2d2fb65 GlobalISel (AArch64): fix ABI at border between GPRs and SP.
If a struct would end up half in GPRs and half on SP the ABI says it should
actually go entirely on the stack. We were getting this wrong in GlobalISel
before, causing compatibility issues.
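
The rule can be illustrated with a small Python model. This is a sketch under simplifying assumptions, not the GlobalISel implementation: `assign_args` is a hypothetical helper, arguments are given only as a count of 8-byte registers, and alignment, floating-point registers, and indirectly passed aggregates are ignored.

```python
NUM_GPRS = 8  # x0-x7 carry integer arguments in AAPCS64


def assign_args(sizes_in_regs):
    """Place each argument (size given in 8-byte registers) in GPRs or on the stack."""
    next_gpr = 0
    placement = []
    for n in sizes_in_regs:
        if n <= NUM_GPRS - next_gpr:
            # The whole argument fits in the remaining registers.
            placement.append(("gpr", next_gpr, n))
            next_gpr += n
        else:
            # Never split: the whole argument goes to the stack, and the
            # leftover registers are not used by later arguments either.
            next_gpr = NUM_GPRS
            placement.append(("stack", None, n))
    return placement


# Seven one-register arguments leave only x7 free, so a two-register
# struct goes entirely on the stack instead of being split across x7 and SP.
layout = assign_args([1] * 7 + [2])
```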

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311388 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 21:56:11 +00:00
Steven Wu
d900cd5e44 [IR] AutoUpgrade ModuleFlagBehavior for PIC and PIE level
Summary:
From r303590, ModuleFlagBehavior for PIC and PIE level is changed from
Error to Max. This will cause a bitcode compatibility issue when linking
against a bitcode static archive built with an old compiler.
Add an auto-upgrade path to upgrade the ModuleFlagBehavior in the
old bitcode to match the new one so IRLinker can link them.
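
A rough Python sketch of why the upgrade is needed. The helpers `merge_flag` and `auto_upgrade` are hypothetical and only model the two behaviors named above; they are not the IRLinker or AutoUpgrade code.

```python
ERROR, MAX = "Error", "Max"


def merge_flag(dst, src):
    """Merge two (behavior, value) module flags, as a simplified linker might."""
    (dst_behavior, dst_value), (src_behavior, src_value) = dst, src
    if dst_behavior != src_behavior:
        raise ValueError("conflicting module flag behaviors")
    if dst_behavior == ERROR:
        if dst_value != src_value:
            raise ValueError("conflicting module flag values")
        return dst
    return (MAX, max(dst_value, src_value))


def auto_upgrade(name, flag):
    """Rewrite an old-style PIC/PIE level flag from Error to Max."""
    behavior, value = flag
    if name in ("PIC Level", "PIE Level") and behavior == ERROR:
        return (MAX, value)
    return flag


old_flag = (ERROR, 2)  # as written by an old compiler
new_flag = (MAX, 1)    # as written by a new compiler
merged = merge_flag(auto_upgrade("PIC Level", old_flag), new_flag)
```

Without the upgrade step, `merge_flag(old_flag, new_flag)` fails in this model because the two behaviors disagree.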

Reviewers: tejohnson, mehdi_amini, dexonsmith

Reviewed By: dexonsmith

Subscribers: hans, llvm-commits

Differential Revision: https://reviews.llvm.org/D36556

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311387 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 21:49:13 +00:00
Sanjoy Das
4547fffc04 Revert "Reapply: [ADCE][Dominators] Teach ADCE to preserve dominators"
Summary: This partially reverts commit r311057 since it breaks ADCE.  See PR34258.

Reviewers: kuhar

Subscribers: mcrosier, david2050, llvm-commits

Differential Revision: https://reviews.llvm.org/D36979

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311381 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 20:39:18 +00:00
Sanjay Patel
5aca549a9a [LibCallSimplifier] try harder to fold memcmp with constant arguments (2nd try)
The 1st try was reverted because it could inf-loop by creating a dead instruction.
Fixed that to not happen and added a test case to verify.

Original commit message:

Try to fold:
memcmp(X, C, ConstantLength) == 0 --> load X == *C

Without this change, we're unnecessarily checking the alignment of the constant data,
so we miss the transform in the first 2 tests in the patch.

I noted this shortcoming of LibCallSimplifier in one of the recent CGP memcmp expansion
patches. This doesn't help the example in:
https://bugs.llvm.org/show_bug.cgi?id=34032#c13
...directly, but it's worth short-circuiting more of these simple cases since we're
already trying to do that.

The benefit of transforming to load+cmp is that existing IR analysis/transforms may
further simplify that code. For example, if the load of the variable is common to
multiple memcmp calls, CSE can remove the duplicate instructions.
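
As an illustration only, the equivalence behind the fold can be checked in Python. The function names are hypothetical, and the sketch assumes the length is a legal integer size (1, 2, 4, or 8 bytes); it does not model alignment or the IR itself.

```python
def memcmp_is_zero(x, c, n):
    """Model of `memcmp(X, C, n) == 0` on byte strings."""
    return x[:n] == c[:n]


def load_eq_constant(x, c, n):
    """Model of `load X == *C`: one n-byte integer load compared to a constant."""
    loaded = int.from_bytes(x[:n], "little")
    constant = int.from_bytes(c[:n], "little")  # foldable at compile time
    return loaded == constant


same = load_eq_constant(b"abcd", b"abcd", 4)
different = load_eq_constant(b"abcd", b"abce", 4)
```

Because the constant side folds to an immediate, only the load of `X` remains at run time, which is why the alignment of the constant data should not matter.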

Differential Revision: https://reviews.llvm.org/D36922


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311366 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 19:13:14 +00:00
Craig Topper
bbbb2f573f [InstCombine] Teach foldSelectICmpAnd to recognize a (icmp slt X, 0) and (icmp sgt X, -1) as equivalent to an and with the sign bit of the truncated type
This is similar to what was already done in foldSelectICmpAndOr. Ultimately I'd like to see if we can call foldSelectICmpAnd from foldSelectIntoOp if we detect a power of 2 constant. This would allow us to remove foldSelectICmpAndOr entirely.
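
The equivalence this relies on can be sketched in Python. The helper names are hypothetical and the model only covers the arithmetic, not the InstCombine matching code.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def slt_zero_after_trunc(x, narrow_bits):
    """Model of `icmp slt (trunc X), 0`."""
    return to_signed(x, narrow_bits) < 0


def sgt_minus_one_after_trunc(x, narrow_bits):
    """Model of `icmp sgt (trunc X), -1`."""
    return to_signed(x, narrow_bits) > -1


def sign_bit_set(x, narrow_bits):
    """Model of `(X & SignBitOfNarrowType) != 0`."""
    return (x & (1 << (narrow_bits - 1))) != 0


# For a 16-bit X truncated to 8 bits, both compares are decided by bit 7 alone.
agree = all(
    slt_zero_after_trunc(x, 8) == sign_bit_set(x, 8)
    and sgt_minus_one_after_trunc(x, 8) == (not sign_bit_set(x, 8))
    for x in range(1 << 16)
)
```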

Differential Revision: https://reviews.llvm.org/D36498

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311362 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 19:02:06 +00:00
Sean Fertile
2a641f4cd1 [PPC] Refine checks for emitting TOC restore nop and tail-call eligibility.
For the medium and large code models we only need to check if a call crosses
dso-boundaries when considering tail-call eligibility.

Differential Revision: https://reviews.llvm.org/D34245

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311353 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 17:35:32 +00:00
Sanjay Patel
2e732d6a1b [InstCombine] add tests for memcmp with constant; NFC
This is the baseline (current) version of the tests that would
have been added with the transform in r311333 (reverted at 
r311340 due to inf-looping).

Adding these now to aid in testing and minimize the patch if/when
it is reinstated.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311350 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 16:47:12 +00:00
Sam Elliott
b4d267277b Emit only A Single Opt Remark When Inlining
Summary:
This updates the Inliner to only add a single Optimization
Remark when Inlining, rather than an Analysis Remark and an
Optimization Remark.

Fixes https://bugs.llvm.org/show_bug.cgi?id=33786

Reviewers: anemet, davidxl, chandlerc

Reviewed By: anemet

Subscribers: haicheng, fhahn, mehdi_amini, dblaikie, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D36054

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311349 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 16:45:47 +00:00
Craig Topper
1952c98f8b [InstCombine] Fix a weakness in canEvaluateZExtd around 'and' instructions
Summary:
If the bitsToClear from the LHS of an 'and' comes back non-zero, but all of those bits are known zero on the RHS, we can reset bitsToClear.

Without this, the 'or' in the modified test case blocks the transform because it has non-zero bits in its RHS in those bits.

Reviewers: spatel, majnemer, davide

Reviewed By: davide

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36944

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311343 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 16:04:11 +00:00
Craig Topper
51f04d9893 [X86] When selecting sse_load_f32/f64 pattern, make sure there's only one use of every node all the way back to the root of the match
Summary: With masked operations, it's possible for the operation node like fadd, fsub, etc. to be used by multiple different vselects. Since the pattern matching will start at the vselect, we need to make sure the operation node itself is only used once before we can fold a load. Otherwise we'll end up folding the same load into multiple instructions.

Reviewers: RKSimon, spatel, zvi, igorb

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36938

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311342 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 16:04:04 +00:00
Xinliang David Li
3ab3d94ff5 Revert 311208, 311209
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311341 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 16:00:38 +00:00
Sanjay Patel
544ac6a056 revert r311333: [LibCallSimplifier] try harder to fold memcmp with constant arguments
We're getting lots of compile-timeout bot failures like:
http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/7119
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311340 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 15:16:25 +00:00
Sanjay Patel
f4b3cc81d5 [InstCombine] add vector tests; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311339 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 15:11:39 +00:00
Zachary Turner
b6d8c58d41 [llvm-pdbutil] Add support for dumping detailed module stats.
This adds support for dumping a summary of module symbols
and CodeView debug chunks.  This option prints a table for
each module of all of the symbols that occurred in the module
and the number of times it occurred and total byte size.  Then
at the end it prints the totals for the entire file.

Additionally, this patch adds the -jmc (just my code) option,
which suppresses modules which are from external libraries or
linker imports, so that you can focus only on the object files
and libraries that originate from your own source code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311338 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 14:53:25 +00:00
Sanjay Patel
f191249bc8 [InstCombine] regenerate test checks; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311337 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 14:34:06 +00:00
Sanjay Patel
fe0ed9dc7e [LibCallSimplifier] try harder to fold memcmp with constant arguments
Try to fold:
memcmp(X, C, ConstantLength) == 0 --> load X == *C

Without this change, we're unnecessarily checking the alignment of the constant data, 
so we miss the transform in the first 2 tests in the patch.

I noted this shortcoming of LibCallSimplifier in one of the recent CGP memcmp expansion
patches. This doesn't help the example in:
https://bugs.llvm.org/show_bug.cgi?id=34032#c13
...directly, but it's worth short-circuiting more of these simple cases since we're 
already trying to do that.

The benefit of transforming to load+cmp is that existing IR analysis/transforms may
further simplify that code. For example, if the load of the variable is common to 
multiple memcmp calls, CSE can remove the duplicate instructions.

Differential Revision: https://reviews.llvm.org/D36922


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311333 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 13:55:49 +00:00
Stefan Pintilie
32d044fcf5 [PowerPC] Check if the pre-increment PHI Node already exists
Preparations to use the pre-increment are sometimes done in the target
independent pass Loop Strength Reduction. We try to detect them in the PowerPC
specific pass so that they are not done twice and so that we do not add PHIs
that are not required.

Differential Revision: https://reviews.llvm.org/D36736

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311332 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 13:36:18 +00:00
Igor Breger
041b4a8eaf [GlobalISel][X86] Support G_BRCOND operation.
Summary: Support G_BRCOND operation. For now don't try to fold cmp/trunc instructions.

Reviewers: zvi, guyblank

Reviewed By: guyblank

Subscribers: rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D34754

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311327 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 10:51:54 +00:00
Oliver Stannard
e8fad78d5a [AsmParser] Recommit: Hash is not a comment on some targets
Re-committing after r311325 fixed an unintentional use of '#' comments in
clang.

The '#' token is not a comment for all targets (on ARM and AArch64 it marks an
immediate operand), so we shouldn't treat it as such.

Comments are already converted to AsmToken::EndOfStatement by
AsmLexer::LexLineComment, so this check was unnecessary.

Differential Revision: https://reviews.llvm.org/D36405



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311326 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 09:58:37 +00:00
Igor Breger
cf90ce3e64 [GlobalISel][X86] LowerCall, for now don't handle ByValue function arguments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311321 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 08:59:59 +00:00
Michael Zuckerman
076fb389d7 [InterLeaved] Adding lit test for future work interleaved load stride 3
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311320 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 08:56:39 +00:00
Chandler Carruth
0b54cd97e1 [x86] Teach the "generic" x86 CPU to avoid patterns that are slow on
widely used processors.

This occurred to me when I saw that we were generating 'inc' and 'dec'
when for Haswell and newer we shouldn't. However, there were a few "X is
slow" things that we should probably just set.

I've avoided any of the "X is fast" features because most of those would
be pretty serious regressions on processors where X isn't actually fast.
The slow things are likely to be negligible costs on processors where
these aren't slow and a significant win when they are slow.

In retrospect this seems somewhat obvious. Not sure why we didn't do
this a long time ago.

Differential Revision: https://reviews.llvm.org/D36947

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311318 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 08:45:22 +00:00
Chandler Carruth
8f31059722 [x86] Handle more cases where we can re-use an atomic operation's flags
rather than doing a separate comparison.

This both saves an explicit comparison and avoids the use of `xadd`
which introduces register constraints and other challenges to the
generated code.

The motivating case is from atomic reference counts where `1` is the
sentinel rather than `0` for whatever reason. This can and should be
lowered efficiently on x86 by just using a different flag, however the
x86 code only handled the `0` case.
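
A small Python model of why the flags can be reused. It relies only on the fact that an x86 `sub` sets the same flags as a `cmp` with the same operands; whether these are exactly the patterns the patch matches is an assumption, and the helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def sub_flags(a, b):
    """Return (result, ZF, SF, OF) for a 32-bit x86-style `sub a, b`."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    zf = result == 0
    sf = bool(result >> 31)
    # Signed overflow: the operands differ in sign and the result's sign differs from a's.
    of = bool(((a ^ b) & (a ^ result)) >> 31)
    return result, zf, sf, of


def to_signed(value):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value >> 31 else value


def old_equals_one(old):
    """`fetch_sub(1) == 1`, read from ZF of the subtraction itself."""
    return sub_flags(old, 1)[1]


def old_less_than_one(old):
    """Signed `fetch_sub(1) < 1`, read from SF != OF (the x86 'l' condition)."""
    _, _, sf, of = sub_flags(old, 1)
    return sf != of


# Dropping the last reference when 1 is the sentinel: the old value was 1,
# so the decrement itself sets ZF and no separate compare is needed.
last_reference = old_equals_one(1)
```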

There remain some further opportunities here that are currently hidden
due to canonicalization. I've included test cases that show these and
FIXMEs. However, I don't at the moment have any production use cases and
they seem substantially harder to address.

Differential Revision: https://reviews.llvm.org/D36945

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311317 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 08:45:19 +00:00
Sam Parker
0472b1ccd4 [ARM][AArch64] Cortex-A75 and Cortex-A55 support
This patch introduces support for Cortex-A75 and Cortex-A55, Arm's
latest big.LITTLE A-class cores. They implement the ARMv8.2-A
architecture, including the cryptography and RAS extensions, plus
the optional dot product extension. They also implement the RCpc
AArch64 extension from ARMv8.3-A.

Cortex-A75:
https://developer.arm.com/products/processors/cortex-a/cortex-a75

Cortex-A55:
https://developer.arm.com/products/processors/cortex-a/cortex-a55

Differential Revision: https://reviews.llvm.org/D36667


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311316 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 08:43:06 +00:00
Coby Tayree
745029aaf7 [X86] Allow xacquire/xrelease prefixes
Allow those prefixes on assembly code
Differential Revision: https://reviews.llvm.org/D36845


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311309 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 07:50:15 +00:00
Craig Topper
3eef39bb8a [AVX-512] Don't change which instructions we use for unmasked subvector broadcasts when AVX512DQ is enabled.
There's no functional difference between the AVX512DQ instructions if we're not masking.

This change unifies test checks and removes extra isel entries. The same was done for subvector inserts and extracts recently.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311308 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 05:29:02 +00:00
Craig Topper
3662f50ee1 [AVX512] Add 128->256 vbroadcastf64x2/vbroadcasti64x2 instructions to the EVEX->VEX table.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311307 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 05:03:28 +00:00
Dean Michael Berris
5a082a40a2 [XRay][tools] Support new kinds of instrumentation map entries
Summary:
When extracting the instrumentation map from a binary, we should be able
to recognize the new kinds of instrumentation sleds we've been emitting
with the compiler using -fxray-instrument. This change adds a test for
all the kinds of sleds we currently support (sans the tail-call sled,
which is a bit harder to force in a simple prebuilt input).

Reviewers: kpw, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36819

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311305 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 00:14:06 +00:00
Chandler Carruth
3f7e6da696 Revert r311077: [LV] Using VPlan ...
This causes LLVM to assert fail on PPC64 and crash / infloop in other
cases. Filed http://llvm.org/PR34248 with reproducer attached.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311304 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-20 23:17:11 +00:00
Craig Topper
b3bdcc1c1b [InstCombine] Add a test case for a weakness in canEvaluateZExtd. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311303 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-20 21:38:28 +00:00
Craig Topper
33ebd6e80e [AVX512] Add a test to check what happens when a load is referenced by two different masked scalar intrinsics with the same op inputs, but different masking node.
We're missing some single use checks in the sse_load_f32/f64 handling that cause us to replicate the load.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311300 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-20 19:47:00 +00:00
Kuba Mracek
3ef5d9d5dd Fix archive-update.test after r311296.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311299 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-20 18:31:30 +00:00
Kuba Mracek
6965d51be7 Remove uses of "%T" from test/Object/archive-* tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311296 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-20 18:18:44 +00:00
Kuba Mracek
a7c3f3d69c Get rid of even more "%T" expansions, see <https://reviews.llvm.org/D35396>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311294 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-20 17:05:22 +00:00
Kuba Mracek
fbd10c199d Get rid of some more "%T" expansions, see <https://reviews.llvm.org/D35396>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311293 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-20 17:00:08 +00:00
Benjamin Kramer
efa50a2449 [MachO] Use Twines more efficiently.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311291 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-20 15:13:39 +00:00
Elena Demikhovsky
4a05fa1f1b Changed basic cost of store operation on X86
A store operation takes 2 UOps on X86 processors. The exact cost calculation affects several optimization passes including loop unrolling.
This change compensates for the performance degradation caused by https://reviews.llvm.org/D34458 and shows improvements on some benchmarks.

Differential Revision: https://reviews.llvm.org/D35888



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311285 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-20 12:34:29 +00:00
Aditya Kumar
70fb4705b4 [Loop Vectorize] Added a separate metadata
Added a separate metadata to indicate when the loop
has already been vectorized instead of setting width and count to 1.

Patch written by Divya Shanmughan and Aditya Kumar

Differential Revision: https://reviews.llvm.org/D36220

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311281 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-20 10:32:41 +00:00
Igor Breger
1ce5cae5ff [GlobalISel][X86] Support call ABI.
Summary: Support call ABI. For now, only the Linux C and X86_64_SysV calling conventions are supported. Variadic functions are not supported.

Reviewers: zvi, guyblank, oren_ben_simhon

Reviewed By: oren_ben_simhon

Subscribers: rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D34602

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311279 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-20 09:25:22 +00:00
Igor Breger
4b201cee02 [GlobalISel][X86] Support asymmetric copy from/to GPR physical register.
This case is usually generated by ABI lowering; it requires performing a truncate/anyext.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311278 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-20 07:14:40 +00:00
Sam Elliott
fe94416753 Revert "Emit only A Single Opt Remark When Inlining"
Reverting due to clang build failure

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311274 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-20 06:55:10 +00:00
Sam Elliott
2999c9c71d Emit only A Single Opt Remark When Inlining
Summary:
This updates the Inliner to only add a single Optimization
Remark when Inlining, rather than an Analysis Remark and an
Optimization Remark.

Fixes https://bugs.llvm.org/show_bug.cgi?id=33786

Reviewers: anemet, davidxl, chandlerc

Reviewed By: anemet

Subscribers: haicheng, fhahn, mehdi_amini, dblaikie, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D36054

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311273 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-20 06:43:34 +00:00
Sam Elliott
74a34d9193 Keep Optimization Remark Yaml in NewPM
Summary:
The New Pass Manager infrastructure was forgetting to keep around the optimization remark yaml file that the compiler might have been producing. This meant setting the option to '-' for stdout worked, but setting it to a filename didn't give file output (presumably it was deleted because compilation didn't explicitly keep it). This change just ensures that the file is kept if compilation succeeds.

So far I have updated one of the optimization remark output tests to add a version with the new pass manager. It is my intention for this patch to also include changes to all tests that use `-opt-remark-output=` but I wanted to get the code patch ready for review while I was making all those changes.

Fixes https://bugs.llvm.org/show_bug.cgi?id=33951

Reviewers: anemet, chandlerc

Reviewed By: anemet, chandlerc

Subscribers: javed.absar, chandlerc, fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D36906

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311271 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-20 01:30:45 +00:00
Chandler Carruth
e12236f216 [x86] Fix an even stranger corner case where we have multiple levels of
cmov self-referencing.

Pointed out by Amjad Aboud in code review, test case minorly simplified
from the one he posted.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311267 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-19 23:35:50 +00:00
Craig Topper
2d1cd3c597 [AVX512] Use alignedstore256 in a pattern that's emitting a 256-bit movaps from an extract subvector operation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311263 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-19 22:02:02 +00:00
Martin Storsjo
daff186997 [ARM] Check the right order for halves of VZIP/VUZP if both parts are used
This is the exact same fix as in SVN r247254. In that commit, the fix was
applied only for isVTRNMask and isVTRN_v_undef_Mask, but the same issue
is present for VZIP/VUZP as well.

This fixes PR33921.

Differential Revision: https://reviews.llvm.org/D36899

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311258 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-19 19:47:48 +00:00
Teresa Johnson
3329070a6e Fix bot failures by requiring x86 target
The tests added in r311254 require a target triple since they are
running through code generation. Fix bot failures by requiring
an x86 target.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311257 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-19 19:15:04 +00:00
Jatin Bhateja
9dc6615ef8 [DAGCombiner] Extending pattern detection for vector shuffle.
Summary:
    If all the operands of a BUILD_VECTOR extract elements from the same vector then split the
    vector efficiently based on the maximum vector access index.

    Reviewers: zvi, delena, RKSimon, thakis

    Reviewed By: RKSimon

    Subscribers: chandlerc, eladcohen, llvm-commits

    Differential Revision: https://reviews.llvm.org/D35788

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311255 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-19 18:08:59 +00:00
Teresa Johnson
77be502efc [ThinLTO] Fix ThinLTO crash
Summary:
Follow up to fix in r311023, which fixed the case where the combined
index is written to disk. The same samplePGO logic exists for the
in-memory index when computing imports, so we need to filter out
GlobalVariable summaries there too.

Reviewers: davidxl

Subscribers: inglorion, llvm-commits

Differential Revision: https://reviews.llvm.org/D36919

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311254 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-19 18:04:25 +00:00
Jatin Bhateja
a96e1abb6f Revert rL311247 : To rectify commit message.
Summary: This reverts commit rL311247.

Differential Revision: https://reviews.llvm.org/D36927

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311252 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-19 17:59:58 +00:00
Jatin Bhateja
cb4206cf46 Merge branch 'arcpatch-D35788'
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311247 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-19 17:00:04 +00:00
Jatin Bhateja
d40ac3206e Revert rL311242 "Extension of shuffle vector pattern detection, updating post rebase."
Summary:

This reverts commit rL311242.

Differential Revision: https://reviews.llvm.org/D36924

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311246 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-19 16:40:06 +00:00
Jatin Bhateja
a1afcacc9f Extension of shuffle vector pattern detection, updating post rebase.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311242 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-19 15:58:36 +00:00
Victor Leschuk
97c7061e09 revert failing test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311238 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-19 12:24:41 +00:00
Victor Leschuk
f377b57c53 Add temporary test to verify that win10 builder hangs on error
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311236 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-19 12:02:39 +00:00
Chandler Carruth
c3557e20c3 [Inliner] Fix a nasty bug when inlining a non-recursive trace of
a function into itself.

We tried to fix this before in r306495 but that got reverted as the
assert was actually hit.

This fixes the original bug (which we seem to have lost track of with
the revert) by blocking a second remapping when the function being
inlined is also the caller and the remapping could succeed but
erroneously.

The included test case would actually load from an inlined copy of the
alloca before this change, failing to load the stored value and
miscompiling.

Many thanks to Richard Smith for diagnosing a user miscompile to this
bug, and to Kyle for the first attempt and initial analysis and David Li
for remembering the issue and how to fix it and suggesting the patch.
I'm just stitching it together and landing it. =]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311229 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-19 06:56:11 +00:00
Chandler Carruth
ff12911639 [Inliner] Clean up a test case a bit to make it more clear what is being
tested and why.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311228 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-19 06:06:44 +00:00
Chandler Carruth
ee26c4120d [x86] Teach the cmov converter to aggressively convert cmovs with memory
operands into control flow.

We have periodically seen performance problems with cmov where one
operand comes from memory. On modern x86 processors with strong branch
predictors and speculative execution, this tends to be much better done
with a branch than cmov. We routinely see cmov stalling while the load
is completed rather than continuing, and if there are subsequent
branches, they cannot be speculated in turn.

Also, in many (even simple) cases, macro fusion causes the control flow
version to be fewer uops.

Consider the IACA output for the initial sequence of code in a very hot
function in one of our internal benchmarks that motivates this, and notice the
micro-op reduction provided.
Before, SNB:
```
Throughput Analysis Report
--------------------------
Block Throughput: 2.20 Cycles       Throughput Bottleneck: Port1

| Num Of |              Ports pressure in cycles               |    |
|  Uops  |  0  - DV  |  1  |  2  -  D  |  3  -  D  |  4  |  5  |    |
---------------------------------------------------------------------
|   1    |           | 1.0 |           |           |     |     | CP | mov rcx, rdi
|   0*   |           |     |           |           |     |     |    | xor edi, edi
|   2^   | 0.1       | 0.6 | 0.5   0.5 | 0.5   0.5 |     | 0.4 | CP | cmp byte ptr [rsi+0xf], 0xf
|   1    |           |     | 0.5   0.5 | 0.5   0.5 |     |     |    | mov rax, qword ptr [rsi]
|   3    | 1.8       | 0.6 |           |           |     | 0.6 | CP | cmovbe rax, rdi
|   2^   |           |     | 0.5   0.5 | 0.5   0.5 |     | 1.0 |    | cmp byte ptr [rcx+0xf], 0x10
|   0F   |           |     |           |           |     |     |    | jb 0xf
Total Num Of Uops: 9
```
After, SNB:
```
Throughput Analysis Report
--------------------------
Block Throughput: 2.00 Cycles       Throughput Bottleneck: Port5

| Num Of |              Ports pressure in cycles               |    |
|  Uops  |  0  - DV  |  1  |  2  -  D  |  3  -  D  |  4  |  5  |    |
---------------------------------------------------------------------
|   1    | 0.5       | 0.5 |           |           |     |     |    | mov rax, rdi
|   0*   |           |     |           |           |     |     |    | xor edi, edi
|   2^   | 0.5       | 0.5 | 1.0   1.0 |           |     |     |    | cmp byte ptr [rsi+0xf], 0xf
|   1    | 0.5       | 0.5 |           |           |     |     |    | mov ecx, 0x0
|   1    |           |     |           |           |     | 1.0 | CP | jnbe 0x39
|   2^   |           |     |           | 1.0   1.0 |     | 1.0 | CP | cmp byte ptr [rax+0xf], 0x10
|   0F   |           |     |           |           |     |     |    | jnb 0x3c
Total Num Of Uops: 7
```
The difference even manifests in a throughput cycle rate difference on Haswell.
Before, HSW:
```
Throughput Analysis Report
--------------------------
Block Throughput: 2.00 Cycles       Throughput Bottleneck: FrontEnd

| Num Of |                    Ports pressure in cycles                     |    |
|  Uops  |  0  - DV  |  1  |  2  -  D  |  3  -  D  |  4  |  5  |  6  |  7  |    |
---------------------------------------------------------------------------------
|   0*   |           |     |           |           |     |     |     |     |    | mov rcx, rdi
|   0*   |           |     |           |           |     |     |     |     |    | xor edi, edi
|   2^   |           |     | 0.5   0.5 | 0.5   0.5 |     | 1.0 |     |     |    | cmp byte ptr [rsi+0xf], 0xf
|   1    |           |     | 0.5   0.5 | 0.5   0.5 |     |     |     |     |    | mov rax, qword ptr [rsi]
|   3    | 1.0       | 1.0 |           |           |     |     | 1.0 |     |    | cmovbe rax, rdi
|   2^   | 0.5       |     | 0.5   0.5 | 0.5   0.5 |     |     | 0.5 |     |    | cmp byte ptr [rcx+0xf], 0x10
|   0F   |           |     |           |           |     |     |     |     |    | jb 0xf
Total Num Of Uops: 8
```
After, HSW:
```
Throughput Analysis Report
--------------------------
Block Throughput: 1.50 Cycles       Throughput Bottleneck: FrontEnd

| Num Of |                    Ports pressure in cycles                     |    |
|  Uops  |  0  - DV  |  1  |  2  -  D  |  3  -  D  |  4  |  5  |  6  |  7  |    |
---------------------------------------------------------------------------------
|   0*   |           |     |           |           |     |     |     |     |    | mov rax, rdi
|   0*   |           |     |           |           |     |     |     |     |    | xor edi, edi
|   2^   |           |     | 1.0   1.0 |           |     | 1.0 |     |     |    | cmp byte ptr [rsi+0xf], 0xf
|   1    |           | 1.0 |           |           |     |     |     |     |    | mov ecx, 0x0
|   1    |           |     |           |           |     |     | 1.0 |     |    | jnbe 0x39
|   2^   | 1.0       |     |           | 1.0   1.0 |     |     |     |     |    | cmp byte ptr [rax+0xf], 0x10
|   0F   |           |     |           |           |     |     |     |     |    | jnb 0x3c
Total Num Of Uops: 6
```

Note that this cannot be usefully restricted to inner loops. Much of the
hot code we see hitting this is not in an inner loop or not in a loop at
all. The optimization still remains effective and indeed critical for
some of our code.

I have run a suite of internal benchmarks with this change. I saw a few
very significant improvements and a very few minor regressions,
but overall this change rarely has a significant effect. However, the
improvements were very significant, and in quite important routines
responsible for a great deal of our C++ CPU cycles. The gains pretty
clearly outweigh the regressions for us.

I also ran the test-suite and SPEC2006. Only 11 binaries changed at all
and none of them showed any regressions.

Amjad Aboud at Intel also ran this over their benchmarks and saw no
regressions.

Differential Revision: https://reviews.llvm.org/D36858

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311226 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-19 05:01:19 +00:00
Dinar Temirbulatov
484d59e444 [SLPVectorizer] Tighten up VLeft, VRight declaration, remove unnecessary testcase test/Transforms/SLPVectorizer/X86/reorder.ll, NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311223 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-19 03:15:07 +00:00
Dinar Temirbulatov
ef0eca1bd9 [SLPVectorizer] Add opcode parameter to reorderAltShuffleOperands, reorderInputsAccordingToOpcode functions.
Reviewers: mkuper, RKSimon, ABataev, mzolotukhin, spatel, filcab

Subscribers: llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D36766


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311221 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-19 02:54:20 +00:00
Adrian Prantl
96438d3760 Filter out non-constant DIGlobalVariableExpressions reachable via the CU
They won't affect the DWARF output, but they will mess with the
sorting of the fragments. This fixes the crash reported in PR34159.

https://bugs.llvm.org/show_bug.cgi?id=34159

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311217 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-19 01:15:06 +00:00
Eric Beckmann
79fe5367c1 llvm-mt: Merge manifest namespaces.
mt.exe performs a tree merge where certain element nodes are combined
into one.  This introduces the possibility of xml namespaces conflicting
with each other.  The original mt.exe has a hierarchy whereby certain
namespace names can override others, and nodes that would then end up in
ambiguous namespaces have their namespaces explicitly defined.  This
namespace handles this merging process.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311215 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-19 00:37:41 +00:00
Xinliang David Li
6d9231092c [Profile] backward propagate profile info in JumpThreading
Differential Revision: http://reviews.llvm.org/D36864


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311208 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-18 23:00:05 +00:00
Amjad Aboud
066b24cb94 [InstCombine] Teach ComputeNumSignBitsImpl to handle integer multiply instruction.
Differential Revision: https://reviews.llvm.org/D36679
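
A rough illustration of how sign bits can be bounded for a multiply. This is a sketch of one conservative bound, not necessarily the exact logic in the patch; `numSignBits8`, `mulSignBitsBound` and `boundHoldsForAllInt8` are hypothetical names, and the check is limited to 8-bit values so it can be exhaustive.

```cpp
#include <cstdint>

// Number of identical leading bits (including the sign bit itself) in an
// 8-bit two's-complement value. The result is always in [1, 8].
unsigned numSignBits8(int8_t V) {
  unsigned Bits = static_cast<uint8_t>(V);
  unsigned Sign = (Bits >> 7) & 1;
  unsigned N = 1;
  for (int I = 6; I >= 0 && ((Bits >> I) & 1) == Sign; --I)
    ++N;
  return N;
}

// Conservative lower bound on the sign bits of A * B, given lower bounds on
// the sign bits of each operand (hypothetical helper, not LLVM's API).
unsigned mulSignBitsBound(unsigned TyBits, unsigned SignA, unsigned SignB) {
  unsigned ValidA = TyBits - SignA + 1; // bits needed to represent A
  unsigned ValidB = TyBits - SignB + 1; // bits needed to represent B
  unsigned OutValid = ValidA + ValidB;  // bits needed for the product
  return OutValid > TyBits ? 1 : TyBits - OutValid + 1;
}

// Exhaustively checks the bound for every pair of 8-bit operands.
bool boundHoldsForAllInt8() {
  for (int A = -128; A <= 127; ++A)
    for (int B = -128; B <= 127; ++B) {
      // Truncate the product to 8 bits, as a wrapping i8 multiply would.
      int8_t Prod = static_cast<int8_t>(static_cast<uint8_t>(A * B));
      unsigned Bound =
          mulSignBitsBound(8, numSignBits8(static_cast<int8_t>(A)),
                           numSignBits8(static_cast<int8_t>(B)));
      if (numSignBits8(Prod) < Bound)
        return false;
    }
  return true;
}
```

When the operands together need more bits than the type has, the bound falls back to 1, which is always safe.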


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311206 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-18 22:56:55 +00:00
Max Kazantsev
de4770b949 [IRCE] Fix buggy behavior in Clamp
Clamp function was too optimistic when choosing signed or unsigned min/max function for calculations.
In fact, `!IsSignedPredicate` guarantees us that `Smallest` and `Greatest` can be compared safely using unsigned
predicates, but we did not check this for `S` which can in theory be negative.

This patch makes Clamp use signed min/max for cases when it fails to prove `S` being non-negative,
and it adds a test where such situation may lead to incorrect conditions calculation.
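
The hazard described above can be sketched outside of LLVM. The helpers below are hypothetical stand-ins (they are not the IRCE code): they clamp `S` into `[Smallest, Greatest]` once with signed and once with unsigned comparisons, to show how a negative `S` is mishandled by the unsigned variant.

```cpp
#include <algorithm>
#include <cstdint>

// Clamp using signed comparisons: correct for any S.
int64_t clampSigned(int64_t S, int64_t Smallest, int64_t Greatest) {
  return std::max(Smallest, std::min(Greatest, S));
}

// Clamp using unsigned comparisons: only safe if S is known non-negative,
// because a negative S reinterpreted as unsigned looks like a huge value.
int64_t clampUnsigned(int64_t S, int64_t Smallest, int64_t Greatest) {
  auto U = [](int64_t V) { return static_cast<uint64_t>(V); };
  uint64_t R = std::max(U(Smallest), std::min(U(Greatest), U(S)));
  return static_cast<int64_t>(R);
}
```

With `Smallest = 0` and `Greatest = 10`, a value of `S = -1` clamps to 0 under signed comparisons but to 10 under unsigned ones, which is the kind of wrong bound the patch guards against.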

Differential Revision: https://reviews.llvm.org/D36873


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311205 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-18 22:50:29 +00:00
Jonas Devlieghere
15ccbc58e5 [llvm-dwarfdump] Hide .debug_str and DIE reference offsets in brief mode
This patch hides the .debug_str offset and DIE reference offsets into
the CU when llvm-dwarfdump is invoked with -brief.

Differential Revision: https://reviews.llvm.org/D36835

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311201 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-18 21:35:44 +00:00
Simon Pilgrim
2bd18ec173 [X86][ADX] Regenerate ADX intrinsics tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311198 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-18 21:21:14 +00:00
Ana Pazos
3686d78a5c [PGO] Fixed assertion due to mismatched memcpy size type.
Summary:
Memcpy intrinsics have size argument of any integer type, like i32 or i64.
Fixed size type along with its value when cloning the intrinsic.

Reviewers: davidxl, xur

Reviewed By: davidxl

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D36844

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311188 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-18 19:17:08 +00:00
Tim Northover
9b4ee7baf4 ARM: use an external relocation for calls from MachO ARM mode.
The internal (__text-relative) relocation risks the offset not being encodable
if the destination is Thumb.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311187 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-18 19:13:56 +00:00
Matt Morehouse
6dcfafe8ab [SanitizerCoverage] Add stack depth tracing instrumentation.
Summary:
Augment SanitizerCoverage to insert maximum stack depth tracing for
use by libFuzzer.  The new instrumentation is enabled by the flag
-fsanitize-coverage=stack-depth and is compatible with the existing
trace-pc-guard coverage.  The user must also declare the following
global variable in their code:
  thread_local uintptr_t __sancov_lowest_stack

https://bugs.llvm.org/show_bug.cgi?id=33857

Reviewers: vitalybuka, kcc

Reviewed By: vitalybuka

Subscribers: kubamracek, hiraditya, cfe-commits, llvm-commits

Differential Revision: https://reviews.llvm.org/D36839

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311186 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-18 18:43:30 +00:00
Marek Sokolowski
0b33df9bb6 Reapply: [llvm-rc] Add basic RC scripts parsing ability.
For now, the parser supports a limited set of statements and
resources. This will be extended in the following patches.

Thanks to Nico Weber (thakis) for his original work in this area.

This patch was originally submitted as r311175 and got reverted
in r311177 because of the problems with compilation under gcc.

Differential Revision: https://reviews.llvm.org/D36340

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311184 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-18 18:24:17 +00:00
Jonas Devlieghere
86286f91c5 [Debug info] Transfer DI to fragment expressions for split integer values.
This patch teaches the SDag type legalizer how to split up debug info for
integer values that are split into a hi and lo part.

(re-commit)

Differential Revision: https://reviews.llvm.org/D36805

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311181 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-18 18:07:00 +00:00
Marek Sokolowski
43ca59772e Revert "[llvm-rc] Add basic RC scripts parsing ability."
This reverts commit r311175.

This broke compilation on some buildbots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311177 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-18 17:25:55 +00:00
Marek Sokolowski
c05432ec0f [llvm-rc] Add basic RC scripts parsing ability.
For now, the parser supports a limited set of statements and
resources. This will be extended in the following patches.

Thanks to Nico Weber (thakis) for his original work in this area.

Differential Revision: https://reviews.llvm.org/D36340

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311175 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-18 17:05:47 +00:00
Simon Pilgrim
5b56d19e34 [X86][BMI2] Added scheduling test for RORX/SARX/SHLX/SHRX instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311171 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-18 16:26:39 +00:00
Simon Pilgrim
6d77959242 [X86][AES] Add scheduling latency/throughput tests for AES instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311167 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-18 15:26:51 +00:00
Simon Pilgrim
8b577a139a [X86][PCLMUL] Add scheduling latency/throughput test for PCLMULQDQ instruction
Added it to the SSE42 tests as targets seem to always have both

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311166 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-18 15:08:30 +00:00
Simon Pilgrim
b29118b50c [X86][SHA] Add scheduling latency/throughput tests for SHA instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311164 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-18 14:55:50 +00:00
Simon Pilgrim
0fec92d0a3 [X86][MOVBE] Add scheduling latency/throughput tests for MOVBE instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311163 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-18 14:44:31 +00:00
Simon Pilgrim
3362574348 [X86][BMI2] Added scheduling test for MULX instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311159 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-18 13:22:18 +00:00
Sjoerd Meijer
3d8decf651 [AArch64] Do not promote f16 when subtarget HasFullFP16
Armv8.2-A adds FP16 support, i.e. f16 is not only a storage-only type, but it
also supports performing data processing on 16-bit floating-point quantities.
All the necessary (tablegen) groundwork of adding the ARMv8.2-A FP16 (scalar)
instructions was done in D15014. To take advantage of this, this patch avoids
promotion of f16 to f32 types when the subtarget supports FullFP16, which
enables instruction selection of these FP16 instructions.

Differential Revision: https://reviews.llvm.org/D36396


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311154 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-18 10:51:14 +00:00
Diana Picus
f2ff8aa1cf Revert "GlobalISel (AArch64): fix ABI at border between GPRs and SP."
This reverts commit e8fd209647 in an
attempt to appease the GlobalISel buildbot, which fails in the
test-suite with errors like
fpcmp: files differ without tolerance allowance

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311151 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-18 09:31:21 +00:00
Geoff Berry
6c9f36933c Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding" round 2
This reverts commit r311135.

sanitizer-x86_64-linux-android buildbot is timing out with just this
patch applied.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311142 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-18 01:43:11 +00:00
Richard Smith
11110e1279 Increase tail dup threshold for -O3 from 3 to 4.
We see a modest performance improvement from this slightly higher tail dup threshold.

Differential Revision: https://reviews.llvm.org/D36775


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311139 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 23:38:41 +00:00
Craig Topper
1c92091839 [X86] Remove SSE/AVX patterns for AND/XOR/OR/ANDN that checked for the inputs being bitcasted from floating point types.
There's really no reason to do this; we should just let isel pick the integer version and let the execution dependency fixing pass take care of moving to FP if necessary.

It's not very reliable to look for bitcasts at the edges of patterns. If for some reason one input was bitcasted and the other wasn't, or if one was a v4f32 bitcast and one was a v2f64 bitcast, we would have fallen back to the integer pattern anyway.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311138 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 23:20:57 +00:00
Tim Northover
e8fd209647 GlobalISel (AArch64): fix ABI at border between GPRs and SP.
If a struct would end up half in GPRs and half on SP the ABI says it should
actually go entirely on the stack. We were getting this wrong in GlobalISel
before, causing compatibility issues.
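
A minimal sketch of the rule, assuming eight argument GPRs (x0..x7) and, as I understand the AAPCS64 rules, that once an argument spills to the stack no later argument may use the leftover registers. `ArgLocation` and `assignGPRArg` are hypothetical names, not GlobalISel APIs.

```cpp
struct ArgLocation {
  bool InRegisters;  // true: argument lives in x[FirstReg] onwards
  unsigned FirstReg; // meaningful only when InRegisters is true
  unsigned NextGPR;  // next free argument register after this assignment
};

// Place an argument that needs NumRegs consecutive 64-bit GPRs, given that
// NextGPR registers have already been handed out.
ArgLocation assignGPRArg(unsigned NextGPR, unsigned NumRegs) {
  const unsigned NumArgGPRs = 8; // x0..x7
  if (NextGPR + NumRegs <= NumArgGPRs)
    return {true, NextGPR, NextGPR + NumRegs};
  // Does not fit: the whole argument goes on the stack (never half and half),
  // and the remaining registers are treated as exhausted.
  return {false, 0, NumArgGPRs};
}
```

For example, a two-register struct arriving when only x7 is free is placed entirely on the stack rather than split between x7 and SP.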

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311137 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 23:14:01 +00:00
Geoff Berry
d93db263e5 Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
Two issues identified by buildbots were addressed:
    - The pass no longer forwards COPYs to physical register uses, since
      doing so can break code that implicitly relies on the physical
      register number of the use.
    - The pass no longer forwards COPYs to undef uses, since doing so
      can break the machine verifier by creating LiveRanges that don't
      end on a use (since the undef operand is not considered a use).

    [MachineCopyPropagation] Extend pass to do COPY source forwarding

    This change extends MachineCopyPropagation to do COPY source forwarding.

    This change also extends the MachineCopyPropagation pass to be able to
    be run during register allocation, after physical registers have been
    assigned, but before the virtual registers have been re-written, which
    allows it to remove virtual register COPY LiveIntervals that become dead
    through the forwarding of all of their uses.

    Reviewers: qcolombet, javed.absar, MatzeB, jonpa

    Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny

    Differential Revision: https://reviews.llvm.org/D30751

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311135 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 23:06:55 +00:00
Jonas Devlieghere
84dc1f35b1 Revert "[Debug info] Transfer DI to fragment expressions for split integer values."
This reverts commit r311102.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311111 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 17:58:33 +00:00
Alexey Bataev
752c0a0190 [SimplifyCFG] Add a test for preserve store alignment, NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311106 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 17:26:52 +00:00
Sanjay Patel
6257fc9a0b [x86] add tests for vector select-of-constants; NFC
We've discussed canonicalizing to this form in IR, so the backend
should be prepared to lower these in ways better than what we see
here in most cases.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311103 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 17:07:37 +00:00
Jonas Devlieghere
6c37616078 [Debug info] Transfer DI to fragment expressions for split integer values.
This patch teaches the SDag type legalizer how to split up debug info for
integer values that are split into a hi and lo part.

Differential Revision: https://reviews.llvm.org/D36805



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311102 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 17:06:48 +00:00
Sanjay Patel
36dc99ec47 [PowerPC] add tests for vector select-of-constants; NFC
We've discussed canonicalizing to this form in IR, so the backend
should be prepared to lower these in ways better than what we see
here.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311099 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 17:03:11 +00:00
Adrian Prantl
cb8c418e34 Improve line debug info when translating a CaseBlock to SDNodes.
The SelectionDAGBuilder translates various conditional branches into
CaseBlocks which are then translated into SDNodes. If a conditional
branch results in multiple CaseBlocks only the first CaseBlock is
translated into SDNodes immediately, the rest of the CaseBlocks are
put in a queue and processed when all LLVM IR instructions in the
basic block have been processed.

When a CaseBlock is transformed into SDNodes the SelectionDAGBuilder
is queried for the current LLVM IR instruction and the resulting
SDNodes are annotated with the debug info of the current
instruction (if it exists and has debug metadata).

When the deferred CaseBlocks are processed, the SelectionDAGBuilder
does not have a current LLVM IR instruction, and the resulting SDNodes
will not have any debuginfo. As DwarfDebug::beginInstruction() outputs
a .loc directive for the first instruction in a labeled
block (typically the case for something coming from a CaseBlock) this
tends to produce a line-0 directive.

This patch changes the handling of CaseBlocks to store the current
instruction's debug info into the CaseBlock when it is created (and the
SelectionDAGBuilder knows the current instruction) and to always use
the stored debug info when translating a CaseBlock to SDNodes.

Patch by Frej Drejhammar!

Differential Revision: https://reviews.llvm.org/D36671

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311097 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 16:57:13 +00:00
Craig Topper
7feb6fc8e5 [AVX512] Don't switch unmasked subvector insert/extract instructions when AVX512DQI is enabled.
There's no reason to switch instructions with and without DQI. It just creates extra isel patterns and test divergences.

There is however value in enabling the masked version of the instructions with DQI.

This required introducing some new multiclasses to enable this splitting.

Differential Revision: https://reviews.llvm.org/D36661

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311091 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 15:40:25 +00:00
Victor Leschuk
7f37f07d7b Mark Verifier/invalid-eh.ll as unsupported on windows
Mark this unsupported for now as it causes test hangs on the buildbot.
Will re-enable it when the problem is debugged.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311089 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 15:07:03 +00:00
Simon Dardis
987e30d867 [dfsan] Add explicit zero extensions for shadow parameters in function wrappers.
In the case where dfsan provides a custom wrapper for a function,
shadow parameters are added for each parameter of the function.
These parameters are i16s. For targets which do not consider this
a legal type, the lack of sign extension information would cause
LLVM to generate anyexts around their usage with phi variables
and calling convention logic.

Address this by introducing zero exts for each shadow parameter.

Reviewers: pcc, slthakur

Differential Revision: https://reviews.llvm.org/D33349


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311087 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 14:14:25 +00:00
Daniel Sanders
c3fa9e8b81 [globalisel][tablegen] Generate TypeObject table. NFC
Summary:
Generate the type table from the types used by a target rather than hard-coding
the union of types used by all targets.

Depends on D36084

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Reviewed By: rovka

Subscribers: kristof.beyls, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D36085

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311084 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 13:18:35 +00:00
Simon Pilgrim
8f5ac0464c [DAGCombiner] Add support for non-uniform constant vectors to (mul x, (1 << c)) -> x << c
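
The identity behind this combine can be illustrated per lane. This is only a sketch using a hypothetical 4-lane vector type (`V4`) and helper names, and it assumes every shift amount is smaller than the lane width:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V4 = std::array<uint32_t, 4>;

// Per-lane multiply by a power of two; each lane may use a different power.
V4 mulByPow2(const V4 &X, const V4 &ShAmt) {
  V4 R{};
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = X[I] * (uint32_t{1} << ShAmt[I]); // requires ShAmt[I] < 32
  return R;
}

// The equivalent per-lane left shift.
V4 shlPerLane(const V4 &X, const V4 &ShAmt) {
  V4 R{};
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = X[I] << ShAmt[I]; // requires ShAmt[I] < 32
  return R;
}
```

Both functions wrap modulo 2^32, so they agree lane by lane even when the multiply overflows.
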
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311083 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 13:03:34 +00:00
Davide Italiano
75aa840968 [Verifier] Avoid visiting DIGlobalVariables twice.
We currently visit them twice.
Once, through `visitMDNode()` -> (the code generated by)
  `../include/llvm/IR/Metadata.def:109` -> `visitDIGlobalVariable()`
Then, through `visitMDNode()` -> `visitDIGlobalVariableExpression()`
  -> `visitDIGlobalVariable()`

This results in verification failures printed twice, e.g.:

  $ ./opt -verify ../../test/DebugInfo/pr34186.ll
  missing global variable type
  !4 = distinct !DIGlobalVariable(name: "pat", scope: !0,
    file: !1, line: 27, isLocal: true, isDefinition: true)
  missing global variable type
  !4 = distinct !DIGlobalVariable(name: "pat", scope: !0,
    file: !1, line: 27, isLocal: true, isDefinition: true)
  ./opt: ../../test/DebugInfo/pr34186.ll: error: input module is broken!

The patch removes one call so we ensure each GV is visited exactly once.

Differential Revision:  https://reviews.llvm.org/D36797

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311081 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 11:32:21 +00:00
Ayal Zaks
cd8f8f7fd4 [LV] Using VPlan to model the vectorized code and drive its transformation
VPlan is an ongoing effort to refactor and extend the Loop Vectorizer. This
patch introduces the VPlan model into LV and uses it to represent the vectorized
code and drive the generation of vectorized IR.

In this patch VPlan models the vectorized loop body: the vectorized control-flow
is represented using VPlan's Hierarchical CFG, with predication refactored from
being a post-vectorization-step into a vectorization planning step modeling
if-then VPRegionBlocks, and generating code inline with non-predicated code. The
vectorized code within each VPBasicBlock is represented as a sequence of
Recipes, each responsible for modelling and generating a sequence of IR
instructions. To keep the size of this commit manageable the Recipes in this
patch are coarse-grained and capture large chunks of LV's code-generation logic.
The constructed VPlans are dumped in dot format under -debug.

This commit retains current vectorizer output, except for minor instruction
reorderings; see associated modifications to lit tests.

For further details on the VPlan model see docs/Proposals/VectorizationPlan.rst
and its references.

Authors: Gil Rapaport and Ayal Zaks

Differential Revision: https://reviews.llvm.org/D32871


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311077 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 09:29:59 +00:00
Daniel Sanders
2cd3b1f607 Re-commit: [globalisel][tablegen] Support zero-instruction emission.
Summary:
Support the case where an operand of a pattern is also the whole of the
result pattern. In this case the original result and all its uses must be
replaced by the operand. However, register class restrictions can require
a COPY. This patch handles both cases by always emitting the copy and
leaving it for the register allocator to optimize.

The previous commit failed on Windows machines due to a flaw in the sort
predicate which allowed both A < B < C and B == C to be satisfied
simultaneously. The cause of this was some sloppiness in the priority order of
G_CONSTANT instructions compared to other instructions. These had equal priority
because it makes no difference, however there were operands that had higher priority
than G_CONSTANT but lower priority than any other instruction. As a result, a
priority order between G_CONSTANT and other instructions must be enforced to
ensure the predicate defines a strict weak order.
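
The class of bug described above can be reproduced in miniature. The sketch below uses a hypothetical three-way `Kind` and made-up ranks, not the real tablegen matcher priorities: the flawed predicate treats two kinds as equivalent while a third kind sorts strictly between them, which breaks transitivity.

```cpp
enum class Kind { Constant, Operand, Instruction };

// Hypothetical priority ranks: a lower rank sorts first.
int priorityRank(Kind K) {
  switch (K) {
  case Kind::Constant:    return 0;
  case Kind::Operand:     return 1;
  case Kind::Instruction: return 2;
  }
  return -1; // unreachable for valid enumerators
}

// Flawed: Constant and Instruction compare as "equal", yet Operand sits
// strictly between them, so this is not a strict weak order.
bool flawedLess(Kind A, Kind B) {
  bool ConstVsInst = (A == Kind::Constant && B == Kind::Instruction) ||
                     (A == Kind::Instruction && B == Kind::Constant);
  if (ConstVsInst)
    return false;
  return priorityRank(A) < priorityRank(B);
}

// Fixed: every pair of kinds has an enforced order.
bool strictLess(Kind A, Kind B) { return priorityRank(A) < priorityRank(B); }

// Checks transitivity of a predicate over all triples of kinds.
template <typename Less> bool isTransitive(Less L) {
  const Kind All[] = {Kind::Constant, Kind::Operand, Kind::Instruction};
  for (Kind A : All)
    for (Kind B : All)
      for (Kind C : All)
        if (L(A, B) && L(B, C) && !L(A, C))
          return false;
  return true;
}
```

Sorting with a predicate like `flawedLess` is undefined behaviour for the standard sort algorithms, which is consistent with a failure that shows up only on some platforms.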

Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar

Subscribers: javed.absar, kristof.beyls, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D36084


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311076 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 09:26:14 +00:00
Jonas Paulsson
59bdb88371 [SystemZ, MachineScheduler] Improve post-RA scheduling.
The idea of this patch is to continue the scheduler state over an MBB boundary
in the case where the successor block has only one predecessor. This means
that the scheduler will continue in the successor block (after emitting any
branch instructions) with e.g. maintained processor resource counters.
Benchmarks have been confirmed to benefit from this.

The algorithm in MachineScheduler.cpp that extracts scheduling regions of an
MBB has been extended so that the strategy may optionally reverse the order
of processing the regions themselves. This is controlled by a new method
doMBBSchedRegionsTopDown(), which defaults to false.

Handling the top-most region of an MBB first also means that a top-down
scheduler can continue the scheduler state across any scheduling boundary
between two regions inside an MBB.

Review: Ulrich Weigand, Matthias Braun, Andy Trick.
https://reviews.llvm.org/D35053

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311072 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 08:33:44 +00:00
Elad Cohen
605e60b1d2 [SelectionDAG] Teach the vector-types operand scalarizer about SETCC
When v1i1 is legal (e.g. AVX512) the legalizer can reach
a case where a v1i1 SETCC with an illegal vector type operand
wasn't scalarized (since v1i1 is legal) but its operands do
have to be scalarized. This used to assert because SETCC was
missing from the vector operand scalarizer.

This patch attempts to teach the legalizer to handle these cases
by scalarizing the operands, converting the node into a scalar
SETCC node.

Differential revision: https://reviews.llvm.org/D36651

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311071 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 08:06:36 +00:00
Serguei Katkov
a01b42e49a [CGP] Fix the rematerialization of gc.relocates
If we want to substitute the relocation of derived pointer with gep of base then
we must ensure that relocation of base dominates the relocation of derived pointer.

Currently only a check for the basic block is present. However, it is possible that both
relocations are in the same basic block but the relocation of the derived pointer is defined
earlier.

The patch moves the relocation of base pointer right before relocation of derived
pointer in this case.

Reviewers: sanjoy,artagnon,igor-laevsky,reames
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36462


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311067 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 05:48:30 +00:00
Geoff Berry
a6a5be21df Revert "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
This reverts commit r311038.

Several buildbots are breaking, and at least one appears to be due to
the forwarding of physical regs enabled by this change.  Reverting while
I investigate further.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311062 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 04:04:11 +00:00
Saleem Abdulrasool
e042428b3e ARM: mark CPSR as clobbered for Windows VLAs
When lowering a VLA, we emit a __chkstk call.  However, this call can
internally clobber CPSR.  We did not mark this register as an ImpDef,
which could potentially allow a comparison to be hoisted above the call
to `__chkstk`.  In such a case, the CPSR could be clobbered, and the
check invalidated.  When the support was initially added, it seemed that
the call would take care of preventing CPSR from being clobbered, but
this is not the case.  Mark the register as clobbered to fix a possible
state corruption.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311061 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 02:42:24 +00:00
Jakub Kuderski
85bef5a5c4 Reapply: [ADCE][Dominators] Teach ADCE to preserve dominators
Summary:
This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees.

I didn't notice any performance impact when bootstrapping clang with this patch.

The patch was originally committed in r311039 and reverted in r311049.
This revision fixes the problem with not adding a dependency on the
DominatorTreeWrapperPass for the LegacyPassManager.

Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki

Reviewed By: davide

Subscribers: grandinj, zhendongsu, llvm-commits, david2050

Differential Revision: https://reviews.llvm.org/D35869

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311057 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 01:41:49 +00:00
Sanjay Patel
4480e9fa9b [x86] add cmov promotion tests for D36711; NFC
This way we can see what the current codegen looks like.
I've also explicitly added/removed the cmov attribute from the RUN lines,
so we know exactly what we're checking in the runs.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311052 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-16 22:50:11 +00:00
Amjad Aboud
58903453c3 [InstCombine] Teach canEvaluateTruncated to handle arithmetic shift (including those with vector splat shift amount)
Differential Revision: https://reviews.llvm.org/D36784


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311050 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-16 22:42:38 +00:00
Jakub Kuderski
7d9adf9346 Revert "[ADCE][Dominators] Teach ADCE to preserve dominators"
This reverts commit r311039. The patch caused the
`test/Bindings/OCaml/Output/scalar_opts.ml` to fail.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311049 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-16 22:10:53 +00:00
Craig Topper
454718f93b [InstCombine] Make folding (X >s -1) ? C1 : C2 --> ((X >>s 31) & (C2 - C1)) + C1 support splat vectors
This also uses decomposeBitTestICmp to decode the compare.
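
The scalar identity behind this fold can be checked with a small sketch. The function names are hypothetical, and the code assumes that `>>` on a negative `int32_t` is an arithmetic shift (guaranteed since C++20, and the behaviour of common compilers before that):

```cpp
#include <cstdint>

// The original form: (X >s -1) ? C1 : C2.
int32_t selectForm(int32_t X, int32_t C1, int32_t C2) {
  return X > -1 ? C1 : C2;
}

// The folded form: ((X >>s 31) & (C2 - C1)) + C1, computed in unsigned
// arithmetic so that wrap-around is well defined.
int32_t foldedForm(int32_t X, int32_t C1, int32_t C2) {
  uint32_t Mask = static_cast<uint32_t>(X >> 31); // 0 or 0xFFFFFFFF
  uint32_t Diff = static_cast<uint32_t>(C2) - static_cast<uint32_t>(C1);
  return static_cast<int32_t>((Mask & Diff) + static_cast<uint32_t>(C1));
}
```

For non-negative `X` the mask is zero and the result is `C1`; for negative `X` the mask is all ones and the result is `C1 + (C2 - C1)`, which is `C2`.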

Differential Revision: https://reviews.llvm.org/D36781

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311044 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-16 21:52:07 +00:00
Jakub Kuderski
56c786ccab [ADCE][Dominators] Teach ADCE to preserve dominators
Summary:
This patch teaches ADCE to preserve both DominatorTrees and PostDominatorTrees.

I didn't notice any performance impact when bootstrapping clang with this patch.

Reviewers: dberlin, chandlerc, sanjoy, davide, grosser, brzycki

Reviewed By: davide

Subscribers: grandinj, zhendongsu, llvm-commits, david2050

Differential Revision: https://reviews.llvm.org/D35869

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311039 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-16 20:50:23 +00:00
Geoff Berry
31db6f3bd2 [MachineCopyPropagation] Extend pass to do COPY source forwarding
This change extends MachineCopyPropagation to do COPY source forwarding.

This change also extends the MachineCopyPropagation pass to be able to
be run during register allocation, after physical registers have been
assigned, but before the virtual registers have been re-written, which
allows it to remove virtual register COPY LiveIntervals that become dead
through the forwarding of all of their uses.
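
As a rough illustration of what COPY source forwarding means, here is a toy model over a straight-line block. It is not the MachineCopyPropagation implementation: `Inst` and `forwardCopySources` are hypothetical, registers are plain strings, and physical-register or undef-operand restrictions are ignored.

```cpp
#include <iterator>
#include <map>
#include <string>
#include <vector>

struct Inst {
  std::string Op;                // "COPY" or any other opcode
  std::string Def;               // register written ("" if none)
  std::vector<std::string> Uses; // registers read
};

// Rewrite uses of a COPY's destination to read the COPY's source instead,
// as long as neither register has been redefined in between.
void forwardCopySources(std::vector<Inst> &Block) {
  std::map<std::string, std::string> CopySrc; // copy dest -> copy source
  for (Inst &I : Block) {
    // Forward available copy sources into this instruction's uses.
    for (std::string &U : I.Uses) {
      auto It = CopySrc.find(U);
      if (It != CopySrc.end())
        U = It->second;
    }
    if (!I.Def.empty()) {
      // A new definition invalidates copies that wrote or read that register.
      CopySrc.erase(I.Def);
      for (auto It = CopySrc.begin(); It != CopySrc.end();)
        It = (It->second == I.Def) ? CopySrc.erase(It) : std::next(It);
    }
    // Record this COPY so later uses of its destination can be forwarded.
    if (I.Op == "COPY" && I.Uses.size() == 1 && !I.Def.empty() &&
        I.Def != I.Uses[0])
      CopySrc[I.Def] = I.Uses[0];
  }
}
```

Once every use of a COPY's destination has been forwarded, the COPY itself is dead and can be removed by a later cleanup, which is the effect the description above relies on.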

Reviewers: qcolombet, javed.absar, MatzeB, jonpa

Subscribers: jyknight, nemanjai, llvm-commits, nhaehnle, mcrosier, mgorny

Differential Revision: https://reviews.llvm.org/D30751

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311038 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-16 20:50:01 +00:00
Simon Atanasyan
10781fc2c1 [mips] Handle R_MIPS_TLS_DTPREL32/64 relocations in the RelocVisitor
Debug information for TLS variables on MIPS might have R_MIPS_TLS_DTPREL32
or R_MIPS_TLS_DTPREL64 relocations. This patch adds support for such
relocations in the `RelocVisitor`.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311031 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-16 19:01:22 +00:00
Xinliang David Li
2f4468d845 [PGO] Fix ThinLTO crash
Differential Revision: http://reviews.llvm.org/D36640


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311023 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-16 17:18:01 +00:00
Simon Pilgrim
43d9a37996 [X86] Regenerate immediate store merging tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311016 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-16 16:22:19 +00:00
Hal Finkel
a62eb7baad [BDCE] Don't check demanded bits on unsized types
To clear assumptions that are potentially invalid after trivialization, we need
to walk the use/def chain. Normally, the only way to reach an instruction with
an unsized type is via an instruction that has side effects (or otherwise will
demand its input bits). That would stop the walk. However, if we have a
readnone function that returns an unsized type (e.g., void), we must avoid
asking for the demanded bits of the function call's return value. A
void-returning readnone function is always dead (and so we can stop walking the
use/def chain here), but the check is necessary to avoid asserting.

Fixes PR34211.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311014 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-16 16:09:22 +00:00
Davide Italiano
c79eba5730 [Verifier] Reject globals without a type associated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311012 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-16 15:16:33 +00:00
Dmitry Preobrazhensky
917eb1c735 [AMDGPU][MC][GFX9] Added op_sel support for v_mad_*16, v_fma_f16, v_div_fixup_f16
This change implements features postponed in https://reviews.llvm.org/D35424 because of a dependency on https://reviews.llvm.org/D36322

Reviewers: SamWot, artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D36694

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311011 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-16 15:16:32 +00:00
Balaram Makam
edd00a7e54 Revert "MachineInstr: Reason locally about some memory objects before going to AA."
r310825 caused the clang-ppc64le-linux-lnt bot to go red
(http://lab.llvm.org:8011/builders/clang-ppc64le-linux-lnt/builds/5712)
because of a test-suite failure of
SingleSource/UnitTests/2003-07-09-SignedArgs

This reverts commit 0028f6a872.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311008 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-16 14:17:43 +00:00
Dmitry Preobrazhensky
600899c871 [AMDGPU][MC][GFX9] Added integer clamping support for VOP3 opcodes
See Bug 34152: https://bugs.llvm.org//show_bug.cgi?id=34152

Reviewers: SamWot, artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D36674

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311006 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-16 13:51:56 +00:00
Simon Pilgrim
b148872e50 [CostModel][X86][XOP] Improve costs for XOP shuffles
VPPERM/VPERMIL2PD/VPERMIL2PS all provide more effective 2-input shuffles than regular AVX instructions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311005 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-16 13:50:20 +00:00
Davide Italiano
9c770381b9 [DI] Every DIGlobalVariable should have a type.
I'll make this a verifier check to catch other violations. This
commit fixes the tests already in tree.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311004 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-16 13:39:07 +00:00
Simon Dardis
c29af95cf1 [mips] Handle variables with an explicit section and interactions with .sdata, .sbss
If a variable has an explicit section such as .sdata or .sbss, it is placed
in that section and accessed in a gp relative manner. This overrides the global
-G setting.

Otherwise, if a variable has an explicit section attached to it, such as '.rodata'
or '.mysection', it is not placed in the small data section. This also overrides
the global -G setting.
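
As an illustration of the two cases described above, here is a minimal sketch, assuming an ELF target and the GCC/Clang `section` attribute. The variable names are hypothetical and not taken from the patch or its tests.

```cpp
// Case 1: explicit small-data section. Per the description above, on MIPS this
// variable is placed in .sdata and accessed gp-relative, regardless of -G.
int small_counter __attribute__((section(".sdata"))) = 1;

// Case 2: any other explicit section. Per the description above, this variable
// is not placed in the small data section, again regardless of -G.
int custom_value __attribute__((section(".mysection"))) = 2;

// Hypothetical accessor so that both variables are referenced.
int sum_both() { return small_counter + custom_value; }
```

On a non-MIPS ELF target the attributes still select the named sections, but the gp-relative access is specific to the MIPS backend.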

Reviewers: atanasyan, nitesh.jain

Differential Revision: https://reviews.llvm.org/D36616


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311001 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-16 12:18:04 +00:00
Sam Parker
f281371190 [ARM] Improve loop unrolling for Cortex-M
- Set the default runtime unroll count to 4 and use the newly added
  UnrollRemainder option.
- Create loop cost and force unroll for a cost less than 12.
- Disable unrolling on Thumb1 only targets.

Differential Revision: https://reviews.llvm.org/D36134


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310997 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-16 07:42:44 +00:00
Igor Breger
2d6d71c7e1 [GlobalISel][X86] Fix mir tests, use correct physical register. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310996 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-16 07:25:51 +00:00
Martin Storsjo
4c13451fc9 [llvm-dlltool] Fix creating stdcall/fastcall import libraries for i386
Hook up the -k option (that in the original GNU dlltool removes the
@n suffix from the symbol that the final executable ends up linked to).

In llvm-dlltool, make sure that functions end up with the undecorate
name type if this option is set and they are decorated. In mingw, when
creating import libraries from def files instead of creating an import
library as a side effect of linking a DLL, the symbol names in the def
contain the stdcall/fastcall decoration (but no leading underscore).

By setting the undecorate name type, a linker linking to the import
library will omit the decoration from the DLL import entry.

With this in place, mingw-w64 for i386 built with llvm-dlltool/clang
produces import libraries that actually work.
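
The effect of the undecorate name type on a symbol can be sketched as follows. This is a minimal illustration based on the PE/COFF description of IMPORT_NAME_UNDECORATE (skip a leading `?`, `@`, or `_`, then truncate at the first `@`). The function `undecorate` is a hypothetical helper, not code from llvm-dlltool or the linker.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: mimics what a linker does with an import whose name
// type is "undecorate". Not the actual LLVM implementation.
std::string undecorate(const std::string &sym) {
  std::size_t start = 0;
  // Skip one leading '?', '@' (fastcall) or '_' if present.
  if (!sym.empty() && (sym[0] == '?' || sym[0] == '@' || sym[0] == '_'))
    start = 1;
  // Truncate at the first '@' after the prefix (the "@n" argument-size suffix).
  std::size_t at = sym.find('@', start);
  if (at == std::string::npos)
    return sym.substr(start);
  return sym.substr(start, at - start);
}
```

With this sketch, a stdcall entry such as `func@8` from a def file maps to the DLL export name `func`, and a fastcall entry such as `@fast@12` maps to `fast`.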

Differential Revision: https://reviews.llvm.org/D36548

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310990 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-16 05:18:36 +00:00