Commit Graph

164958 Commits

Author SHA1 Message Date
Alexandros Lamprineas
832288a5c9 [InstCombine, ARM] Convert vld1 to llvm load
Convert a vector load intrinsic into an llvm load instruction.
This is beneficial when the underlying object being addressed
comes from a constant, since we get constant-folding for free.

Differential Revision: https://reviews.llvm.org/D46273

llvm-svn: 333643
2018-05-31 12:19:18 +00:00
Clement Courbet
99716a5b0e [X86] Extract latency of fldz/fld1 in separate classes.
Summary:
 - I've measured the values for Broadwell, Haswell, SandyBridge, Skylake.
 - For ZnVer1 and Atom, values were transferred form `InstRW`s.
 - For SLM and BtVer2, values are from Agner.

This is split off from https://reviews.llvm.org/D47377

Reviewers: RKSimon, andreadb

Subscribers: gbedwell, llvm-commits

Differential Revision: https://reviews.llvm.org/D47523

llvm-svn: 333642
2018-05-31 11:41:27 +00:00
Simon Pilgrim
3255d0fae6 [X86][SSE] Add support for detecting SUB(SPLAT_BV, SPLAT) cases for shift-rotate patterns.
This improves splat rotations (rotation by an uniform value), to avoid having to use the generic non-uniform shift code (extension to PR37426).

llvm-svn: 333641
2018-05-31 11:25:16 +00:00
Pavel Labath
c8fe2424ea DWARFAcceleratorTable: fix equal_range iterators
Summary:
Both (Apple and DWARF5) implementations of the iterators had bugs which
resulted in crashes if one attempted to iterate through the accelerator
tables all the way.

For the Apple tables, the issue was that we did not clear the DataOffset
field when we reached the end, which made our iterator compare unequal
to the "end" iterator. For the Dwarf5 tables, the problem was that we
incremented the CurrentIndex pointer and then used the incremented
(possibly invalid) pointer to check whether we have reached the end of
the index list.

The reason these bugs went undetected is because their only user
(dwarfdump) only ever searched for the first match. Besides allowing us
to test this fix, changing llvm-dwarfdump --find to display all matches
seems like a good improvement (it makes the behavior consistent with the
--name option), so I change llvm-dwarfdump to do that.

The existing tests would be sufficient to test this fix with the new
llvm-dwarfdump behavior, but I add a special test that demonstrates that
the tool indeed displays multiple results. The find.test test needed to
be tweaked a bit as the tool now does not print the ".debug_info
contents" header (also consistent with how --name works).

Reviewers: JDevlieghere, aprantl, dblaikie

Subscribers: mgrang, llvm-commits

Differential Revision: https://reviews.llvm.org/D47543

llvm-svn: 333635
2018-05-31 08:47:00 +00:00
Luke Geeson
1fa16feed0 [AArch64] Reverted rL333427 fixing Clang UnitTest Failure
llvm-svn: 333634
2018-05-31 08:27:53 +00:00
Max Kazantsev
d5b4e54dde [NFC] Factor out a method for further extension
llvm-svn: 333633
2018-05-31 08:08:34 +00:00
Roman Lebedev
1e6ec34d95 [llvm-exegesis][NFCI] Counter::Counter(): more useful msg on event open error
Summary:
I'm slowly looking into a new X86 scheduler model,
for AMD Bulldozer CPU, model 2 (bdver2, Piledriver).

And naturally, i have hit that assert :)
I happened to know what it meant, and how to fix it,
but that is not too common knowledge.

Reviewers: courbet, RKSimon

Reviewed By: courbet

Subscribers: tschuett, llvm-commits, craig.topper

Differential Revision: https://reviews.llvm.org/D47572

llvm-svn: 333632
2018-05-31 07:08:26 +00:00
Roman Lebedev
2fee5906e4 Revert rL333106 / D46814: [InstCombine] Fold unfolded masked merge pattern with variable mask!
In post-commit review, Eric Christopher notes that many
new MSan warnings are being observed with this patch.

The probable reason is: if 'y' is undef here and we could
evaluate it twice and get different results.
We can't increase the number of uses of a value.

llvm-svn: 333631
2018-05-31 06:00:36 +00:00
Joel E. Denny
a04841e104 [lit] Fix windows cmd.exe test config for r333620
llvm-svn: 333630
2018-05-31 05:48:33 +00:00
Stanislav Mekhanoshin
1a2c74c831 [AMDGPU] Track occupancy in MFI
Keep track of achieved occupancy in SIMachineFunctionInfo.
At the moment we have a lot of duplicated or even missed code to
query and maintain occupancy info. Record it in the MFI and
query in a single call. Interfaces:

- getOccupancy() - returns current recorded achieved occupancy.
- getMinAllowedOccupancy() - returns lesser of the achieved occupancy
and the lowest occupancy we are ready to tolerate. For example if
a kernel is memory bound we are ready to tolerate 4 waves.
- limitOccupancy() - record occupancy level if we have to lower it.
- increaseOccupancy() - record occupancy if scheduler managed to
increase the occupancy.

MFI takes care of integrating different checks affecting occupancy,
including LDS use and waves-per-eu attribute. Note that scheduler
starts with not yet known register pressure, so has to record either
limit or increase in occupancy after it is done. Later passes can
just query a resulting value.

New interface is used in the active scheduler and NFC wrt its work.
Changes are also made to experimental schedulers to use it and record
an occupancy after they are done. Before the change waves-per-eu was
ignored by experimental schedulers and tolerance window for memory
bound kernels was not used.

Differential Revision: https://reviews.llvm.org/D47509

llvm-svn: 333629
2018-05-31 05:36:04 +00:00
Jan Vesely
0ce62c1f6a AMDGPU/R600: Make sure functions are cacheline aligned
v2: use "ensureAlignment"
    make functions cache line aligned
Fixes GPU hangs since r333219:
"AMDGPU: Split R600 AsmPrinter code into its own class"

Differential Revision: https://reviews.llvm.org/D47516

llvm-svn: 333622
2018-05-31 04:08:08 +00:00
Joel E. Denny
bd389c40fd [lit] Terminate ": RUN at line N" with ";" not "&&"
This fixes projects/compiler-rt/test/fuzzer/sigusr.test, which was
broken by r333614.  The trouble was that "&&" changes the command for
which "$!" gives the pid.

llvm-svn: 333620
2018-05-31 03:40:37 +00:00
Roman Tereshin
7e5b6b8d2b [GlobalISel][Legalizer] LegalizerInfo verifier: Making LegalizerInfo::verify(...) errors fatal
Reviewers: aemerson, qcolombet

Reviewed By: qcolombet

Differential Revision: https://reviews.llvm.org/D46339

llvm-svn: 333619
2018-05-31 01:56:07 +00:00
Roman Tereshin
041719be2f [GlobalISel][AArch64] LegalizerInfo verifier: Fixing bugs exposed by LegalizerInfo::verify(...)
Reviewers: aemerson, qcolombet

Reviewed By: qcolombet

Differential Revision: https://reviews.llvm.org/D46339

llvm-svn: 333618
2018-05-31 01:56:05 +00:00
Joel E. Denny
3204e36001 [lit] Report line number for failed RUN command
(Relands r333584, reverted in 333592.)

When debugging test failures with -vv (or -v in the case of the
internal shell), this makes it easier to locate the RUN line that
failed.  For example, clang's test/Driver/linux-ld.c has 892 total RUN
lines, and clang's test/Driver/arm-cortex-cpus.c has 424 RUN lines
after concatenation for line continuations.

When reading the generated shell script, this also makes it easier to
locate the RUN line that produced each command.

To support reporting RUN line numbers in the case of the internal
shell, this patch extends the internal shell to support the null
command, ":", except pipelines are not supported.

To support reporting RUN line numbers in the case of windows cmd.exe
as the external shell, this patch extends -vv to set "echo on" instead
of "echo off" in bat files.  (Support for windows cmd.exe as a lit
external shell will likely be dropped later, but I found out too
late.)

Reviewed By: delcypher,	asmith, stella.stamenova, jmorse, lebedev.ri, rnk

Differential Revision: https://reviews.llvm.org/D44598

llvm-svn: 333614
2018-05-31 00:55:32 +00:00
Sanjay Patel
ad6c7a9c3a [InstCombine] don't change the size of a select if it would mismatch its condition operands' sizes
Don't always:
cast (select (cmp x, y), z, C) --> select (cmp x, y), (cast z), C'

This is something that came up as far back as D26556, and I lost track of it. 
I suspect that this transform is part of the underlying problem that is 
inspiring some of the recent proposals that seek to match larger patterns 
that include a cast op. Even if that's not true, this transform causes
problems for codegen (particularly with vector types).

A transform to actively match the size of cmp and select operand sizes should
follow. This patch just removes the harmful canonicalization in the other
direction.

Differential Revision: https://reviews.llvm.org/D47163

llvm-svn: 333611
2018-05-31 00:16:58 +00:00
Sanjay Patel
96cfd94233 [InstCombine] don't negate constant expression with fsub (PR37605)
X + (-C) would be transformed back into X - C, so infinite loop:
https://bugs.llvm.org/show_bug.cgi?id=37605

llvm-svn: 333610
2018-05-30 23:55:12 +00:00
Vedant Kumar
dc70cf34c3 [llvm-cov] Use the new PrintHTMLEscaped utility
This removes some duplicate logic to escape characters in HTML output.

llvm-svn: 333608
2018-05-30 23:35:14 +00:00
Tom Stellard
d68071370f AMDGPU: Split AMDGPUTTI into GCNTTI and R600TTI
Reviewers: arsenm, nhaehnle

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D47359

llvm-svn: 333605
2018-05-30 22:55:35 +00:00
Vlad Tsyrklevich
c0190a4b57 [LowerTypeTests] Discard extern_weak linkage for definitions
Summary:
Fix PR37625. It's possible for an extern_weak declaration to be emitted
to the merged module when a definition exists in the ThinLTO portion of
the build; discard the linkage on the declaration in that case.
(otherwise we copy the linkage to the alias to the jumptable and fail)

Reviewers: pcc

Reviewed By: pcc

Subscribers: mehdi_amini, llvm-commits, kcc

Differential Revision: https://reviews.llvm.org/D47494

llvm-svn: 333604
2018-05-30 22:39:52 +00:00
George Burgess IV
4bc9642da3 [NewGVN] Fix set comparison; reflow comment
Looks like we intended to compare this->Members with Other->Members
here, but ended up comparing this->Members with this->Members. Oops. :)

Since CongruenceClass::Members is a SmallPtrSet anyway, we can probably
skip building std::sets if we're willing to write a bit more code.

This appears to be no functional change (for sufficiently lax values of
"no"): this equality check was only being called inside of an assert.
So, worst case, we'll catch more bugs in the form of assertion failures.

Thanks to d0k for noting this!

llvm-svn: 333601
2018-05-30 22:24:08 +00:00
Roman Tereshin
f1db1e6ccc [GlobalISel][AArch64] LegalizerInfo verifier: Adding LegalizerInfo::verify(...) call w/o fixing bugs
This is to make it clear what kind of bugs the LegalizerInfo::verifier
is able to catch and test its output

Reviewers: aemerson, qcolombet

Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D46338

llvm-svn: 333597
2018-05-30 22:10:04 +00:00
Joel E. Denny
664a77c58b Revert r333584: [lit] Report line number for failed RUN command
It breaks test-suite.

llvm-svn: 333592
2018-05-30 21:07:27 +00:00
Florian Hahn
207dd7516e [TableGen] Avoid leaking TreePatternNodes by using shared_ptr.
By using std::shared_ptr for TreePatternNode, we can avoid leaking them.

Reviewers: craig.topper, dsanders, stoklund, tstellar, zturner

Reviewed By: dsanders

Differential Revision: https://reviews.llvm.org/D47463

llvm-svn: 333591
2018-05-30 21:00:18 +00:00
Jonas Devlieghere
a044b51457 [ADT] Add unit test for PrintHTMLEscaped
Add unit tests for PrintHTMLEscaped which was added in r333565.

llvm-svn: 333590
2018-05-30 20:47:18 +00:00
Daniel Neilson
8fda4cbaa7 [IRBuilder] Add APIs for creating calls to atomic memmove and memset intrinsics. (NFC)
Summary:
Creating the IRBuilder methods:
 CreateElementUnorderedAtomicMemSet
 CreateElementUnorderedAtomicMemMove

These mirror the methods that create calls to the regular (non-atomic) memmove and
memset intrinsics.

llvm-svn: 333588
2018-05-30 20:02:56 +00:00
Simon Pilgrim
6e90843b35 Fix Wdocumentation warning. NFCI.
llvm-svn: 333586
2018-05-30 19:50:26 +00:00
Joel E. Denny
f5a15f4aa3 [lit] Report line number for failed RUN command
(Relands r330755 (reverted in r330848) with fix for PR37239.)

When debugging test failures with -vv (or -v in the case of the
internal shell), this makes it easier to locate the RUN line that
failed.  For example, clang's test/Driver/linux-ld.c has 892 total RUN
lines, and clang's test/Driver/arm-cortex-cpus.c has 424 RUN lines
after concatenation for line continuations.

When reading the generated shell script, this also makes it easier to
locate the RUN line that produced each command.

To support reporting RUN line numbers in the case of the internal
shell, this patch extends the internal shell to support the null
command, ":", except pipelines are not supported.

To support reporting RUN line numbers in the case of windows cmd.exe
as the external shell, this patch extends -vv to set "echo on" instead
of "echo off" in bat files.  (Support for windows cmd.exe as a lit
external shell will likely be dropped later, but I found out too
late.)

Reviewed By: delcypher,	asmith, stella.stamenova, jmorse, lebedev.ri, rnk

Differential Revision: https://reviews.llvm.org/D44598

llvm-svn: 333584
2018-05-30 19:42:27 +00:00
Benjamin Kramer
250ac9d0ea [CalledValuePropagation] Just use a sorted vector instead of a set.
The set properties are never used, so a vector is enough. No
functionality change intended.

While there add some std::moves to SparseSolver.

llvm-svn: 333582
2018-05-30 19:31:11 +00:00
Peter Collingbourne
33177b3e92 llvm-objcopy: Set sh_link to 0 on unrecognized symtab-linked sections.
Per discussion on the generic-abi mailing list:
https://groups.google.com/forum/#!topic/generic-abi/MPr8TVtnVn4

An object file manipulation tool must either write out a symbol
table with the same number of entries as the original symbol table
and in the same order, or if this is impossible, refuse to operate
on the object file if it has unrecognized sections that are linked
to the symtab section. However, existing tools (namely GNU strip,
GNU objcopy and ld.{bfd,gold,lld} -r) do not comply with this at
present: they change symbol table indexes and set sh_link to 0 on
the unrecognized symtab-linked sections.

We intend to use the latter as a (temporary) signal that a tool has
operated on a proposed new symtab-linked section and invalidated the
symbol table indexes. However, llvm-objcopy currently keeps sh_link
pointing to the new symtab section. This patch changes llvm-objcopy
to set sh_link to 0 to match the behaviour of the other tools.

Differential Revision: https://reviews.llvm.org/D47404

llvm-svn: 333581
2018-05-30 19:30:39 +00:00
Simon Pilgrim
693cbc43af [X86][SSE] Pulled out splat detection helper from LowerScalarVariableShift (NFCI)
Created the IsSplatValue helper from the splat detection code in LowerScalarVariableShift as a first NFC step towards improving support for splat rotations, which is an extension of PR37426.

llvm-svn: 333580
2018-05-30 19:16:59 +00:00
Galina Kistanova
39e5243c4d Reverted r333424 as it broke multiple build bots and left unfixed for a long time
llvm-svn: 333578
2018-05-30 18:51:08 +00:00
Roman Tereshin
ab142eb006 [GlobalISel][Legalizer] LegalizerInfo verifier: check rules cover type indices
This commit adds a simple verifier that tracks type indices being
touched by legalization rules' builders.

Every target will now have an opportunity to call
LegalizerInfo::verify(...) at the end of its derived LegalizerInfo's
constructor and check there are no obvious mistakes like checking only
first type for an opcode that has more than one type index and therefore
implicitly declaring any type for the second (and higher) type index
legal.

The check is only ran in assert builds and should have very minor
performance impact in assert builds and none in release builds.

This commit does not add LegalizerInfo::verify(...) calls to
target-specific legalizers, look for separate commits for that.

This commit also doesn't make the verification errors fatal, only
produces an error message, look for a later commit that does.

Reviewers: aemerson, qcolombet

Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D46338

llvm-svn: 333576
2018-05-30 18:45:32 +00:00
Craig Topper
4dc11ee6f2 [X86] Update the fast-isel tests for _mm_rcp_ss, _mm_rsqrt_ss, and _mm_sqrt_ss to match clang codegen after r333572.
llvm-svn: 333573
2018-05-30 18:30:44 +00:00
Jonas Devlieghere
562ffd5cb3 [dsymutil] Escape HTML special characters in plist.
When printing string in the Plist, we weren't escaping the characters
which lead to invalid XML. This patch adds the escape logic to
StringExtras.

rdar://39785334

llvm-svn: 333565
2018-05-30 17:47:11 +00:00
Roman Tereshin
a1e354b5c2 [GlobalISel][Legalizer] NFC mostly reducing LegalizeRuleSet's methods' inter-dependecies
Making LegalizeRuleSet's implementation a little more dumb and
straightforward to make it easier to read and change, in particular in
order to add the initial version of LegalizerInfo verifier

Reviewers: aemerson, qcolombet

Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D46338

llvm-svn: 333562
2018-05-30 16:54:01 +00:00
Simon Pilgrim
23c13c7c39 [X86][AVX512BW] Fixed check prefix copy+paste typo in avx512bw-intrinsics.ll
Prefix was for AVX512F instead of AVX512BW 

llvm-svn: 333560
2018-05-30 16:29:06 +00:00
Mark Searles
a5207411ca [AMDGPU][Waitcnt] Fix build error: unused variable 'SWaitInst'
https://reviews.llvm.org/rL333556 caused a buildbot failure.

See http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/21876/steps/build_Lld/logs/stdio

/Users/buildslave/as-bldslv9/lld-x86_64-darwin13/llvm.src/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:2007:10: error: unused variable 'SWaitInst' [-Werror,-Wunused-variable]
    auto SWaitInst = BuildMI(EntryBB, EntryBB.getFirstNonPHI(),

The unused variable was for debugging purposes; removing that piece of code
to fix the build.

llvm-svn: 333559
2018-05-30 16:27:57 +00:00
Matt Arsenault
635c1ae9bc AMDGPU: Use better alignment for kernarg lowering
This was just emitting loads with the ABI alignment
for the raw type. The true alignment is often better,
especially when an illegal vector type was scalarized.
The better alignment allows using a scalar load
more often.

llvm-svn: 333558
2018-05-30 16:17:51 +00:00
Karl-Johan Karlsson
229444be5b [ValueTracking] Fix endless recursion in isKnownNonZero()
Summary:
The isKnownNonZero() function have checks that abort the recursion when
it reaches the specified max depth. However one of the recursive calls
was placed before the max depth check was done, resulting in a endless
recursion that eventually triggered a segmentation fault.

Fixed the problem by moving the max depth check above the first
recursive call.

Reviewers: Prazek, nlopes, spatel, craig.topper, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, bjope, llvm-commits

Differential Revision: https://reviews.llvm.org/D47531

llvm-svn: 333557
2018-05-30 15:56:46 +00:00
Mark Searles
87f5640efb [AMDGPU][Waitcnt] Fix handling of loops with many bottom blocks
In terms of waitcnt insertion/if necessary, the waitcnt pass forces convergence
for a loop. Previously, that kicked if greater than 2 passes over a loop, which
doesn't account for loop with many bottom blocks. So, increase the threshold to
(n+1), where n is the number of bottom blocks. This gives the pass an
opportunity to consider the contribution of each bottom block, to the overall
loop, before the forced convergence potentially kicks in.

Differential Revision: https://reviews.llvm.org/D47488

llvm-svn: 333556
2018-05-30 15:47:45 +00:00
Gabor Buella
ae53d06e47 [X86] Lowering FMA intrinsics to native IR (LLVM part)
Support for Clang lowering of fused intrinsics. This patch:

1. Removes bindings to clang fma intrinsics.
2. Introduces new LLVM unmasked intrinsics with rounding mode:
     int_x86_avx512_vfmadd_pd_512
     int_x86_avx512_vfmadd_ps_512
     int_x86_avx512_vfmaddsub_pd_512
     int_x86_avx512_vfmaddsub_ps_512
     supported with a new intrinsic type (INTR_TYPE_3OP_RM).
3. Introduces new x86 fmaddsub/fmsubadd folding.
4. Introduces new tests for code emitted by sequentions introduced in Clang part.

Patch by tkrupa

Reviewers: craig.topper, sroland, spatel, RKSimon

Reviewed By: craig.topper, RKSimon

Differential Revision: https://reviews.llvm.org/D47443

llvm-svn: 333554
2018-05-30 15:25:16 +00:00
Daniel Neilson
03d48f8056 [AliasSet] Teach the alias set how to handle atomic memcpy/memmove/memset
Summary:
The atomic variants of the memcpy/memmove/memset intrinsics can be treated
the same was as the regular forms, with respect to aliasing. Update the
AliasSetTracker to treat the atomic forms the same was as the regular forms.

llvm-svn: 333551
2018-05-30 14:43:39 +00:00
Alexandros Lamprineas
6a62cf4674 [InstCombine, ARM, AArch64] Convert table lookup to shuffle vector
Turning a table lookup intrinsic into a shuffle vector instruction
can be beneficial. If the mask used for the lookup is the constant
vector {7,6,5,4,3,2,1,0}, then the back-end generates byte reverse
instructions instead.

Differential Revision: https://reviews.llvm.org/D46133

llvm-svn: 333550
2018-05-30 14:38:50 +00:00
Simon Pilgrim
ed0a658b1f [X86][AVX512] Replace -cpu=knl with -mattr=+avx512f for avx512-intrinsics tests
It was noticed on D47377 that these tests were being unnecessarily affected by scheduler changes.

This adds vzeroupper at the end of some tests as we lose the 'FeatureFastPartialYMMorZMMWrite' feature from KNL, since Skylake+ don't support this its probably better.

llvm-svn: 333549
2018-05-30 14:36:41 +00:00
Simon Pilgrim
4fed5c7dd0 [X86][SSE] Remove unnecessary -cpu from sttni tests
It was noticed on D47377 that these tests (for PR37246) were being unnecessarily affected by scheduler changes.

llvm-svn: 333546
2018-05-30 14:11:57 +00:00
Simon Pilgrim
ebabe16669 [X86][SSE] Replace -cpu with equivalent -mattr for vec_cast tests
It was noticed on D47377 that these tests were being unnecessarily affected by scheduler changes.

llvm-svn: 333545
2018-05-30 14:01:21 +00:00
Amaury Sechet
f6e6ba0be1 [ARM] Remove code handling ADDC/ADDE/SUBC/SUBE
Summary: This code is now dead as the ARM backend uses ADDCARRY/SUBCARRY/SETCCCARRY .

Reviewers: rogfer01, efriedma, rengolin, javed.absar

Subscribers: kristof.beyls, chrib, llvm-commits

Differential Revision: https://reviews.llvm.org/D47413

llvm-svn: 333544
2018-05-30 13:45:43 +00:00
Krzysztof Parzyszek
8c44cdee95 [Hexagon] Use vector align-left when shift amount fits in 3 bits
This saves an instruction because for align-right the shift amount
would need to be put in a register first.

llvm-svn: 333543
2018-05-30 13:45:34 +00:00
Simon Dardis
72cd920805 [mips] Correct the definition of CTC2/CFC2
llvm-svn: 333542
2018-05-30 13:21:13 +00:00