122050 Commits

Author SHA1 Message Date
Nekotekina
99b5284463 X86: avoid vector-scalar shifts if splat amount is directly a vector ADD/SUB/AND op.
Prefer vector-vector shifts if available (AVX2+).
Improves code generated for rotate and funnel shifts.
Otherwise it would generate a shuffle + slower vector-scalar shift.
2019-04-21 22:35:31 +03:00
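As a rough illustration of the code that commit 99b5284463 is about: a rotate is typically written as two shifts whose amounts are derived from the original amount with AND and SUB. The sketch below models a single 32-bit lane in scalar code; `rotl32` is a hypothetical helper name and is not taken from the patch.

```cpp
#include <cstdint>

// One-lane model of a 32-bit rotate left built from two shifts.
// Both shift amounts are produced by AND/SUB on the original amount,
// which is the kind of amount the commit message mentions.
constexpr std::uint32_t rotl32(std::uint32_t x, std::uint32_t n) {
    const std::uint32_t left = n & 31u;           // AND keeps the amount in range
    const std::uint32_t right = (32u - n) & 31u;  // SUB, then AND, for the other half
    return (x << left) | (x >> right);
}
```

Because each lane may end up with its own amount, a vector-vector shift maps onto this shape directly, whereas a vector-scalar shift would first need the amount extracted from the splat.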
Nekotekina
b860b5e8f4 MCJIT: don't finalize modules on symbol lookup (workaround)
This is extremely slow yet unnecessary with manual finalization.
In LLVM 6 this wasn't a problem.
2019-04-12 21:09:06 +03:00
Nekotekina
40a92ac100 X86: add patterns for X86ISD::VSHLV and X86ISD::VSRLV
Replace a VSELECT that zeroes the result when the SHL/SRL shift amount exceeds the legal range.
2019-04-12 21:09:06 +03:00
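The select that commit 40a92ac100 folds away can be pictured per lane as follows. This is a minimal scalar sketch, assuming 32-bit lanes; `lane_shl` and `lane_srl` are hypothetical names that model a variable shift yielding zero for out-of-range amounts.

```cpp
#include <cstdint>

// Per-lane model: a logical shift guarded by a select that yields zero
// when the amount is not a legal shift amount for a 32-bit lane.
constexpr std::uint32_t lane_shl(std::uint32_t x, std::uint32_t amt) {
    return amt < 32u ? x << amt : 0u;
}

constexpr std::uint32_t lane_srl(std::uint32_t x, std::uint32_t amt) {
    return amt < 32u ? x >> amt : 0u;
}
```

When the hardware shift already produces zero for oversized amounts, the guard is redundant and a single variable-shift node can stand in for the whole expression.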
Nekotekina
1efaf6bcfb X86: add pattern for X86ISD::VSRAV
Detect clamping ashr shift amount to max legal value
2019-04-12 21:09:06 +03:00
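For commit 1efaf6bcfb, the clamped arithmetic shift can be sketched per lane like this. It is a scalar model under the assumption of 32-bit lanes; `ashr32_clamped` is a hypothetical name, and the shift is spelled out on unsigned values so that the result does not depend on how the compiler treats right shifts of negative numbers.

```cpp
#include <algorithm>
#include <cstdint>

// Arithmetic shift right with the amount clamped to the largest legal
// value (31). Oversized amounts therefore fill the lane with the sign bit.
constexpr std::int32_t ashr32_clamped(std::int32_t x, std::uint32_t amt) {
    const std::uint32_t s = std::min<std::uint32_t>(amt, 31u);
    const std::uint32_t ux = static_cast<std::uint32_t>(x);
    // For negative inputs, shift the complement and complement back so
    // that ones are shifted in from the top.
    const std::uint32_t r = x < 0 ? ~(~ux >> s) : ux >> s;
    return static_cast<std::int32_t>(r);
}
```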
Nekotekina
e6e78dcb3c X86: expand detectAVGPattern()
Allow all integer widths in the pattern, allow ashr
Handle signed and mixed cases, allowing to replace truncation
2019-04-12 21:09:06 +03:00
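The base pattern behind commit e6e78dcb3c is a rounding average computed in a wider type and then truncated. The sketch below shows only the plain unsigned 8-bit case as an assumption about the starting point; the signed, mixed and ashr variants the commit mentions are not modelled, and `avg_round_up_u8` is a hypothetical name.

```cpp
#include <cstdint>

// Rounding average of two bytes: widen, add with +1 for rounding,
// shift right by one, then truncate back to 8 bits.
constexpr std::uint8_t avg_round_up_u8(std::uint8_t a, std::uint8_t b) {
    const std::uint16_t wide = static_cast<std::uint16_t>(a) + b + 1u;
    return static_cast<std::uint8_t>(wide >> 1);
}
```

The widening matters: the sum of two bytes plus one can reach 511, which does not fit in 8 bits, so the pattern has to be recognised across the extend and truncate.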
Nekotekina
38ffe4f027 X86: optimize VSELECT for v16i8 with shl + sign bit test 2019-04-12 21:09:06 +03:00
Nekotekina
1d71d6baad X86: combine inversion of VPTERNLOG 2019-04-12 21:09:06 +03:00
Nekotekina
74eeb83511 X86: LowerShift: new algorithm for vector-vector shifts
Emit pair of shifts of double size if possible
2019-04-12 21:09:06 +03:00
Nekotekina
fa826b09d1 X86: Fix/workaround Small Code Model for JIT
Force RIP-relative jump tables and global values
Force RIP-relative all zeros / all ones constants
These things were causing crashes due to use of absolute addressing
2019-04-12 20:50:09 +03:00
Brendon Cahoon
cbf9870fbd [Hexagon] Fix reuse bug in Vector Loop Carried Reuse pass
The Hexagon Vector Loop Carried Reuse pass was allowing reuse between
two shufflevectors with different masks. The reason is that the masks
are not instruction objects, so the code that checks each operand
just skipped over the operands.

This patch fixes the bug by checking if the operands are the same
when they are not instruction objects. If the objects are not the
same, then the code assumes that reuse cannot occur.

Differential Revision: https://reviews.llvm.org/D60019


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358292 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 16:37:12 +00:00
Sanjay Patel
e5a55cebc1 [DAGCombiner] narrow shuffle of concatenated vectors
// shuffle (concat X, undef), (concat Y, undef), Mask -->
// concat (shuffle X, Y, Mask0), (shuffle X, Y, Mask1)

The ARM changes with 'vtrn' and narrowed 'vuzp' are improvements.

The x86 changes look neutral or better. There's one test with an
extra instruction, but that could be reversed for a subtarget with
the right attributes. But by default, we want to avoid the 256-bit
op when possible (in my motivating benchmark, a handful of ymm ops
sprinkled into a sequence of xmm ops are triggering frequency
throttling on Haswell resulting in significantly worse perf).

Differential Revision: https://reviews.llvm.org/D60545

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358291 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 16:31:56 +00:00
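The rewrite quoted in commit e5a55cebc1 can be checked on small vectors. The following is a toy model, not the DAGCombiner code: vectors are `std::vector<int>`, `kUndef` stands in for undef lanes and mask entries, X and Y are assumed to have the same length N, and the mask is assumed to have 2N entries. All function names are hypothetical.

```cpp
#include <vector>

using Vec = std::vector<int>;
constexpr int kUndef = -1;  // stands in for an undef lane or mask entry

// Two-input shuffle: a mask entry m < a.size() selects a[m], larger
// entries select from b, and a negative entry produces an undef lane.
Vec shuffle(const Vec& a, const Vec& b, const Vec& mask) {
    const int n = static_cast<int>(a.size());
    Vec out;
    out.reserve(mask.size());
    for (int m : mask)
        out.push_back(m < 0 ? kUndef : (m < n ? a[m] : b[m - n]));
    return out;
}

// concat X, undef: X followed by the same number of undef lanes.
Vec concat_undef(const Vec& x) {
    Vec out = x;
    out.resize(2 * x.size(), kUndef);
    return out;
}

// Wide form: shuffle (concat X, undef), (concat Y, undef), Mask.
Vec wide_shuffle(const Vec& x, const Vec& y, const Vec& mask) {
    return shuffle(concat_undef(x), concat_undef(y), mask);
}

// Narrow form: concat (shuffle X, Y, Mask0), (shuffle X, Y, Mask1).
Vec narrow_shuffle(const Vec& x, const Vec& y, const Vec& mask) {
    const int n = static_cast<int>(x.size());
    Vec remapped;
    remapped.reserve(mask.size());
    for (int m : mask) {
        if (m >= 0 && m < n)
            remapped.push_back(m);       // lane of X keeps its index
        else if (m >= 2 * n && m < 3 * n)
            remapped.push_back(m - n);   // lane of Y moves down by N
        else
            remapped.push_back(kUndef);  // undef padding or undef entry
    }
    const Vec mask0(remapped.begin(), remapped.begin() + n);
    const Vec mask1(remapped.begin() + n, remapped.end());
    Vec out = shuffle(x, y, mask0);
    const Vec hi = shuffle(x, y, mask1);
    out.insert(out.end(), hi.begin(), hi.end());
    return out;
}
```

In this model both forms produce the same lanes for any mask, while the narrow form only ever operates on N-element vectors, which is the property the commit relies on to avoid the wider op.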
Hiroshi Yamauchi
aae02cfe3c Add options for MaxLoadsPerMemcmp(OptSize).
Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60587

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358287 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 15:05:46 +00:00
Simon Pilgrim
dfff56a5b1 [X86][SSE] Recognise vXi1 boolean anyof/allof reduction patterns
Currently combineHorizontalPredicateResult only handles anyof/allof reduction patterns of legal types, which can be tricky to match as type legalization of bools can introduce bitcasts/truncs/extensions.

This patch extends combineHorizontalPredicateResult to recognise vXi1 bool reductions as well and uses the existing combineBitcastvxi1 helper to create the MOVMSK necessary to then compare the signmask result.

This ensures the accuracy of the reduction costs added in D60403 which assume the MOVMSK generation.

Differential Revision: https://reviews.llvm.org/D60610

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358286 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 14:22:57 +00:00
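The anyof/allof reductions in commit dfff56a5b1 end up as a compare of a sign mask. A scalar model of that idea, assuming sixteen byte lanes where a true lane is all ones, might look like the sketch below; `movmsk`, `any_of_lanes` and `all_of_lanes` are hypothetical names, not LLVM functions.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Bytes16 = std::array<std::int8_t, 16>;

// Scalar model of a byte MOVMSK: bit i of the result is the sign bit of lane i.
inline std::uint32_t movmsk(const Bytes16& v) {
    std::uint32_t mask = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        if (v[i] < 0)
            mask |= 1u << i;
    return mask;
}

// anyof: at least one lane is true, so the mask is non-zero.
inline bool any_of_lanes(const Bytes16& v) { return movmsk(v) != 0u; }

// allof: every lane is true, so all sixteen mask bits are set.
inline bool all_of_lanes(const Bytes16& v) { return movmsk(v) == 0xFFFFu; }
```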
Hans Wennborg
0366e3e181 Revert r358268 "[DebugInfo] DW_OP_deref_size in PrologEpilogInserter."
It causes clang to crash while building Chromium. See https://crbug.com/952230
for reproducer.

> The PrologEpilogInserter needs to insert a DW_OP_deref_size before
> prepending a memory location expression to an already implicit
> expression to avoid having the existing expression act on the memory
> address instead of the value behind it.
>
> The reason for using DW_OP_deref_size and not plain DW_OP_deref is that
> big-endian targets need to read the right size as simply truncating a
> larger read would yield the wrong result (LSB bytes are not at the lower
> address).
>
> Differential Revision: https://reviews.llvm.org/D59687

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358281 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 12:54:52 +00:00
Fangrui Song
e6cd4757bf Use llvm::upper_bound. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358277 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 11:31:16 +00:00
Kang Zhang
9c0652abaf [PowerPC] Add initialization for some ppc passes
Summary:

Some llc debug options take a pass name as a parameter.
However, passing the pass name ppc-early-ret produces the following error:
llc test.ll -stop-after ppc-early-ret
LLVM ERROR: "ppc-early-ret" pass is not registered.
The following pass names all hit the "pass is not registered" error:
ppc-ctr-loops
ppc-ctr-loops-verify
ppc-loop-preinc-prep
ppc-toc-reg-deps
ppc-vsx-copy
ppc-early-ret
ppc-vsx-fma-mutate
ppc-vsx-swaps
ppc-reduce-cr-ops
ppc-qpx-load-splat
ppc-branch-coalescing
ppc-branch-select

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D60248



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358271 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 09:59:40 +00:00
Jeremy Morse
e049fdb84b [DebugInfo] Fix pr41175 Dead Store Elimination missing debug loc
Bug: https://bugs.llvm.org/show_bug.cgi?id=41175

In the bug test case the DSE pass is shortening the range of memory that a
memset is working on. A getelementptr is generated so that the new
starting address can be passed to memset. This instruction was not given
a DebugLoc.

To fix the bug, copy the DebugLoc from the memset instruction.

Patch by Orlando Cazalet-Hyams!

Differential Revision: https://reviews.llvm.org/D60556


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358270 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 09:47:35 +00:00
Markus Lavin
bcae242878 [DebugInfo] DW_OP_deref_size in PrologEpilogInserter.
The PrologEpilogInserter needs to insert a DW_OP_deref_size before
prepending a memory location expression to an already implicit
expression to avoid having the existing expression act on the memory
address instead of the value behind it.

The reason for using DW_OP_deref_size and not plain DW_OP_deref is that
big-endian targets need to read the right size as simply truncating a
larger read would yield the wrong result (LSB bytes are not at the lower
address).

Differential Revision: https://reviews.llvm.org/D59687

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358268 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 08:23:55 +00:00
Eric Christopher
03d462db72 Move getNumFrameInfos and getDwarfFrameInfos out of line and remove
the MCDwarf.h include.

This removes 50 transitive dependencies for a modification of
MCDwarf.h in a build of llc for a pair of out of line functions
and reduces the build overhead of 'touch MCDwarf.h' by 15% without
impacting test time of check-llvm.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358264 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 07:42:35 +00:00
Eric Christopher
d46ddf0061 Add explicit dependencies on MCSection.h and MCDwarf.h to the .cpp
files rather than rely on transitive includes from MCStreamer.h.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358263 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 07:40:01 +00:00
Fangrui Song
a333049e17 [ConstantFold] Don't evaluate FP or FP vector casts or truncations when simplifying icmp
Fix PR41476

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358262 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 07:34:30 +00:00
Eric Christopher
9b4c3643b3 Revert "[PowerPC] Add initialization for some ppc passes"
This reverts commit 6f8f98ce8de7c0e4ebd7fa2e1fd9507fe8d1c317 as it
is breaking nearly every bot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358260 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 07:16:58 +00:00
Eric Christopher
5594c48b1c Move addInitialFrameState out of line and remove the MCDwarf.h include.
This removes 50 transitive dependencies for a modification of
MCDwarf.h in a build of llc for a single out of line function
and reduces the build overhead by 20% without impacting test
time of check-llvm.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358258 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 06:57:45 +00:00
Craig Topper
3f8580a865 [TargetLowering][X86] Teach SimplifyDemandedBits to use ShrinkDemandedOp on ISD::SHL nodes.
If the upper bits of the SHL result aren't used, we might be able to use a narrower shift. For example, on X86 this can turn a 64-bit into 32-bit enabling a smaller encoding.

Differential Revision: https://reviews.llvm.org/D60358

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358257 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 06:49:28 +00:00
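The narrowing described in commit 3f8580a865 rests on a simple identity: when only the low 32 bits of a 64-bit left shift are used, shifting the truncated value gives the same bits. A minimal sketch, assuming shift amounts below 32 and using hypothetical function names:

```cpp
#include <cstdint>

// Wide form: 64-bit shift, of which only the low 32 bits are kept.
constexpr std::uint32_t low32_of_wide_shl(std::uint64_t x, unsigned n) {
    return static_cast<std::uint32_t>(x << n);
}

// Narrow form: truncate first, then shift in 32 bits (valid for n < 32).
constexpr std::uint32_t narrow_shl(std::uint64_t x, unsigned n) {
    return static_cast<std::uint32_t>(x) << n;
}
```

The identity holds because a left shift never moves bits from higher positions into lower ones, so the upper 32 input bits cannot influence the low 32 result bits.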
Kang Zhang
fe3f5c800f [PowerPC] Add initialization for some ppc passes
Summary:

Some llc debug options take a pass name as a parameter.
However, passing the pass name ppc-early-ret produces the following error:
llc test.ll -stop-after ppc-early-ret
LLVM ERROR: "ppc-early-ret" pass is not registered.
The following pass names all hit the "pass is not registered" error:
ppc-ctr-loops
ppc-ctr-loops-verify
ppc-loop-preinc-prep
ppc-toc-reg-deps
ppc-vsx-copy
ppc-early-ret
ppc-vsx-fma-mutate
ppc-vsx-swaps
ppc-reduce-cr-ops
ppc-qpx-load-splat
ppc-branch-coalescing
ppc-branch-select

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D60248



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358256 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 06:35:15 +00:00
Eric Christopher
7e7105bd03 Move addFrameInst out of line and remove the MCDwarf.h include.
This removes 500 transitive dependencies for a modification of
MCDwarf.h in a build of llc for a single out of line function
and reduces the build overhead by more than half without impacting
test time of check-llvm.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358255 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 06:31:59 +00:00
Eric Christopher
0c40f6d128 Include what's used in a few cpp files - these were getting transitive
includes from MCDwarf.h.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358254 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 06:16:33 +00:00
Zi Xuan Wu
5268c8af52 [PowerPC] More precise exploitation of P9 maddld instruction when operands are constant
maddld takes three operands, (add (mul %1, %2), %3), and sometimes
some of them are constant. A constant operand needs an extra li to
materialize it and one more register to hold it, so in that case it is
not profitable to use maddld for the mul-add pattern.

Differential Revision: https://reviews.llvm.org/D60181


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358253 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 05:21:31 +00:00
Fangrui Song
77dd066466 MCDwarfLineTableHeader::tryGetFile : replace a loop with llvm::find
Note, `DirIndex++` below is incorrect for DWARF 5, but it can be fixed
later after the file index is fixed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358251 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 04:55:10 +00:00
Eric Christopher
a588917c3b Move a couple of optional references to just optional to make the
forwarding APIs look similar.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358250 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 03:49:13 +00:00
Fangrui Song
c77c306949 [MC] Fix typo: .symtab_shndxr -> .symtab_shndx
This special section is named .symtab_shndx, according to gABI Chapter 4
Sections, and the name is used by some other tools. Though the section
type SHT_SYMTAB_SHNDX is what really matters, let's fix the typo
introduced in rL204769 :)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358247 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 02:16:15 +00:00
Fangrui Song
66987a8d19 Use llvm::lower_bound. NFC
This reapplies rL358161. That commit inadvertently reverted an exegesis file to an old version.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358246 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 02:02:06 +00:00
Eric Christopher
2ff866e5e8 Remove a parameter that was being passed around that we had at the
local callsite.

NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358244 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-12 01:02:02 +00:00
Nico Weber
42f5049fce llvm-undname: Use UNREACHABLE after exhaustive switch returning everywhere
No behavior change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358241 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-11 23:23:00 +00:00
Nico Weber
0c94d6f746 llvm-undname: Name a bool param, no behavior change
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358240 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-11 23:20:18 +00:00
Nico Weber
4b4fd781e2 llvm-undname: Fix out-of-bounds read on invalid intrinsic function code
Found by inspection.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358239 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-11 23:11:33 +00:00
Nico Weber
54ee04c709 llvm-undname: Don't crash on incomplete enum tag manglings
Found by inspection.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358238 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-11 22:59:25 +00:00
Nico Weber
65c0092a7b llvm-undname: Fix crash on incomplete virtual this adjusts
Found by oss-fuzz.

Also remove an else-after-return, this part has no behavior change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358237 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-11 22:47:18 +00:00
Nick Desaulniers
bfffca9464 [X86AsmPrinter] refactor static functions into private methods. NFC
Summary:
A lot of the code for printing special cases of operands in this
translation unit are static functions. While I too have suffered many
years of abuse at the hands of C, we should prefer private methods,
particularly when you start passing around *this as your first argument,
which is a code smell.

This will help make generic vs arch specific asm printing easier, as it
brings X86AsmPrinter more in line with other arch's derived AsmPrinters.
We will then be able to more easily move architecture generic code to
the base class, and architecture specific code to the derived classes.

Some other small refactorings while we're here:
- the parameter Op is now consistently OpNo
- add spaces around binary expressions. I know we're not millionaires
  but c'mon.

Reviewers: echristo

Reviewed By: echristo

Subscribers: smeenai, hiraditya, llvm-commits, srhines, craig.topper

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60577

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358236 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-11 22:47:13 +00:00
Nico Weber
0907e8aafa llvm-undname: Fix crash on invalid name in a template parameter pointer to member arg
Found by oss-fuzz.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358234 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-11 22:23:35 +00:00
Brendon Cahoon
0cd628a5aa [Pipeliner] Fix incorrect loop carried dependence calculation
The isLoopCarriedDep function does not correctly compute loop
carried dependences when the array index offset is negative
or the stride is smaller than the access size.

Patch by Denis Antrushin.

Differential Revision: https://reviews.llvm.org/D60135



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358233 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-11 21:57:51 +00:00
Nikita Popov
d6ce6ef4db [ConstantRange] Add unsignedMulMayOverflow()
Same as the other ConstantRange overflow checking methods, but for
unsigned mul. In this case there is no cheap overflow criterion, so
using umul_ov for the implementation.

Differential Revision: https://reviews.llvm.org/D60574

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358228 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-11 21:10:33 +00:00
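The check described in commit d6ce6ef4db can be sketched on a simplified range type. This is an assumption-laden model, not the ConstantRange implementation: `URange` is a hypothetical non-wrapping range of 32-bit values, and overflow is detected by multiplying in 64 bits instead of calling umul_ov.

```cpp
#include <cstdint>

enum class OverflowResult { NeverOverflows, MayOverflow, AlwaysOverflows };

// Hypothetical stand-in for a non-wrapping unsigned range [lo, hi].
struct URange {
    std::uint32_t lo;
    std::uint32_t hi;
};

// True if a * b does not fit in 32 bits (the product is formed in 64 bits).
constexpr bool umul_overflows(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::uint64_t>(a) * b > UINT32_MAX;
}

// If even the smallest product overflows, every product does. Otherwise,
// if the largest product overflows, some products may. Otherwise none can.
constexpr OverflowResult unsigned_mul_may_overflow(URange a, URange b) {
    if (umul_overflows(a.lo, b.lo))
        return OverflowResult::AlwaysOverflows;
    if (umul_overflows(a.hi, b.hi))
        return OverflowResult::MayOverflow;
    return OverflowResult::NeverOverflows;
}
```

Only the two extreme products need checking because unsigned multiplication is monotonic in both operands.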
Rong Xu
c2c97e63cc [PGO] Better handling of profile hash mismatch
We currently assume profile hash conflicts will be caught by an upfront
check and we assert for the cases that escape the check. The assumption
is not always true as there are chances of conflict. This patch prints
a warning and skips annotating the function for the escaped cases.

Differential Revision: https://reviews.llvm.org/D60154



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358225 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-11 20:54:17 +00:00
Amara Emerson
23c2f19a50 [AArch64][GlobalISel] Flesh out vector load/store support for more types.
Some of these were legalizing into smaller vector types unnecessarily,
others were simply not supported yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358223 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-11 20:40:01 +00:00
Amara Emerson
a5855780b9 [AArch64][GlobalISel] Legalization and ISel support for load/stores of vectors of pointers.
Loads and stores of values with types like <2 x p0> currently don't get imported
because SelectionDAG has no knowledge of pointer types. To leverage the existing
support for vector load/stores, we can bitcast the value to have s64 element
types instead. We do this as a custom legalization.

This patch also adds support for general loads of <2 x s64>, and relaxes some
type conditions on selecting G_BITCAST.

Differential Revision: https://reviews.llvm.org/D60534

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358221 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-11 20:32:24 +00:00
Craig Topper
0a7c2b2b16 [X86] Restrict vselect handling in scalarizeExtEltFP to the pre-type-legalization case where the setcc result type is vXi1.
If the vector setcc has been legalized then we will need to convert a vector boolean of 0 or -1 to a scalar boolean of 0 or 1.

The added test case previously crashed in 32-bit mode by creating a setcc with an i64 condition that type legalization couldn't expand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358218 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-11 19:57:44 +00:00
Craig Topper
0bc1a86ddd [X86] Add patterns for using movss/movsd for atomic load/store of f32/f64. Remove atomic fadd pseudos; use isel patterns instead.
This patch adds patterns for turning bitcasted atomic load/store into movss/sd.

It also removes the pseudo instructions for atomic RMW fadd. Instead, it adds isel patterns for folding an atomic load into addss/sd and relies on the new movss/sd store pattern to handle the write part.

This also makes the fadd patterns use VEX and EVEX instructions when AVX or AVX512F are enabled.

Differential Revision: https://reviews.llvm.org/D60394

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358215 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-11 19:19:52 +00:00
Craig Topper
b5079b20d2 Recommit r358211 "[X86] Use FILD/FIST to implement i64 atomic load on 32-bit targets with X87, but no SSE2"
With correct test checks this time.

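If we have X87, but not SSE2 we can atomically load an i64 value into the significand of an 80-bit extended precision x87 register using fild. We can then use a fist instruction to convert it back to an i64 integer and store it to a stack temporary. From there we can do two 32-bit loads to get the value into integer registers without worrying about atomicness.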

This matches what gcc and icc do for this case and removes an existing FIXME.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358214 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-11 19:19:42 +00:00
Craig Topper
d11a7fa9a6 Revert r358211 "[X86] Use FILD/FIST to implement i64 atomic load on 32-bit targets with X87, but no SSE2"
I seem to have messed up the test checks.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358212 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-11 19:04:38 +00:00
Craig Topper
27970a3895 [X86] Use FILD/FIST to implement i64 atomic load on 32-bit targets with X87, but no SSE2
If we have X87, but not SSE2 we can atomically load an i64 value into the significand of an 80-bit extended precision x87 register using fild. We can then use a fist instruction to convert it back to an i64 integer and store it to a stack temporary. From there we can do two 32-bit loads to get the value into integer registers without worrying about atomicness.

This matches what gcc and icc do for this case and removes an existing FIXME.

Differential Revision: https://reviews.llvm.org/D60156

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358211 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-11 18:40:21 +00:00