Commit Graph

28800 Commits

Author SHA1 Message Date
Jinsong Ji
f522bdf453 Set useful flags for vector imm setting instructions
Vector imm setting instructions like XXLXORz/XXLXORspz/XXLXORdpz
Should behave like LI8.

We should set corresponding flags to allow rematerialization and other
opts in LICM, RA, Scheduling etc.

Differential Revision: https://reviews.llvm.org/D58645

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355948 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-12 18:27:09 +00:00
Jinsong Ji
d690dcb9da [NFC][PowerPC] Update testcases using utils/update_llc_test_checks.py
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355945 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-12 17:55:32 +00:00
Nikita Popov
f7a652c414 [SDAG] Expand pow2 mulo using shifts
Expand MULO with constant power of two operand into a shift. The
overflow is checked with (x << shift) >> shift == x, where the right
shift will be logical for umulo and arithmetic for smulo (with
exception for multiplications by signed_min).

Differential Revision: https://reviews.llvm.org/D59041

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355937 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-12 16:57:25 +00:00
Simon Pilgrim
5ebb3b9e4a Regenerate sign_extend.ll test.
This will change as part of the fix for the regressions in D58017.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355933 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-12 16:00:59 +00:00
Tim Northover
99eb9152f9 CodeGenPrep: preserve inbounds attribute when sinking GEPs.
Targets can potentially emit more efficient code if they know address
computations never overflow. For example ILP32 code on AArch64 (which only has
64-bit address computation) can ignore the possibility of overflow with this
extra information.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355926 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-12 15:22:23 +00:00
Sam Parker
88f0c562f5 [ARM][NFC] Delete original smlad tests
Because I don't understand svn.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355908 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-12 11:06:15 +00:00
Sam Parker
a7682f4fed [ARM][NFC] Move smlad tests
Created a test/CodeGen/ARM/ParallelDSP folder.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355907 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-12 11:01:11 +00:00
Eugene Leviant
13df7e16ef [CGP] Fix UB when GEP is bound to trivial PHINode
Differential revision: https://reviews.llvm.org/D59140


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355904 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-12 10:10:29 +00:00
David Stuttard
4b43b47d8a [AMDGPU] Add support for immediate operand for S_ENDPGM
Summary:
Add support for immediate operand in S_ENDPGM

Change-Id: I0c56a076a10980f719fb2a8f16407e9c301013f6

Reviewers: alexshap

Subscribers: qcolombet, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, eraman, arphaman, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59213

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355902 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-12 09:52:58 +00:00
Alex Bradbury
456da09d90 [RISCV] Add test cases for the lp64 ABI
These are closely modeled on similar tests for the ilp32 ABI. Like those
tests, we group together tests that should be common cross lp64, lp64+lp64f,
and lp64+lp64f+lp64d ABIs.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355899 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-12 09:26:53 +00:00
Jessica Paquette
2377ecc161 Recommit "[GlobalISel][AArch64] Add selection support for G_EXTRACT_VECTOR_ELT"
After r355865, we should be able to safely select G_EXTRACT_VECTOR_ELT without
running into any problematic intrinsics.

Also add a fix for lane copies, which don't support index 0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355871 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-11 22:18:01 +00:00
Alex Bradbury
f2cfb7b3d3 [RISCV] Do a sign-extension in a compare-and-swap of 32 bit in RV64A
AtomicCmpSwapWithSuccess is legalised into an AtomicCmpSwap plus a comparison.
This requires an extension of the value which, by default, is a
zero-extension. When we later lower AtomicCmpSwap into a PseudoCmpXchg32 and then expanded in
RISCVExpandPseudoInsts.cpp, the lr.w instruction does a sign-extension.

This mismatch of extensions causes the comparison to fail when the compared
value is negative. This change overrides TargetLowering::getExtendForAtomicOps
for RISC-V so it does a sign-extension instead.

Differential Revision: https://reviews.llvm.org/D58829
Patch by Ferran Pallarès Roca.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355869 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-11 21:41:22 +00:00
Jessica Paquette
cb9067728a [GlobalISel][AArch64] Always fall back on aarch64.neon.addp.*
Overloaded intrinsics aren't necessarily safe for instruction selection. One
such intrinsic is aarch64.neon.addp.*.

This is a temporary workaround to ensure that we always fall back on that
intrinsic. Eventually this will be replaced with a proper solution.

https://bugs.llvm.org/show_bug.cgi?id=40968

Differential Revision: https://reviews.llvm.org/D59062

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355865 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-11 20:51:17 +00:00
Nikita Popov
802a6632d5 [SDAG][AArch64] Legalize VECREDUCE
Fixes https://bugs.llvm.org/show_bug.cgi?id=36796.

Implement basic legalizations (PromoteIntRes, PromoteIntOp,
ExpandIntRes, ScalarizeVecOp, WidenVecOp) for VECREDUCE opcodes.
There are more legalizations missing (esp float legalizations),
but there's no way to test them right now, so I'm not adding them.

This also includes a few more changes to make this work somewhat
reasonably:

 * Add support for expanding VECREDUCE in SDAG. Usually
   experimental.vector.reduce is expanded prior to codegen, but if the
   target does have native vector reduce, it may of course still be
   necessary to expand due to legalization issues. This uses a shuffle
   reduction if possible, followed by a naive scalar reduction.
 * Allow the result type of integer VECREDUCE to be larger than the
   vector element type. For example we need to be able to reduce a v8i8
   into an (nominally) i32 result type on AArch64.
 * Use the vector operand type rather than the scalar result type to
   determine the action, so we can control exactly which vector types are
   supported. Also change the legalize vector op code to handle
   operations that only have vector operands, but no vector results, as
   is the case for VECREDUCE.
 * Default VECREDUCE to Expand. On AArch64 (only target using VECREDUCE),
   explicitly specify for which vector types the reductions are supported.

This does not handle anything related to VECREDUCE_STRICT_*.

Differential Revision: https://reviews.llvm.org/D58015

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355860 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-11 20:22:13 +00:00
Jonas Paulsson
e3834c1b19 [RegAlloc] Avoid compile time regression with multiple copy hints.
As a fix for https://bugs.llvm.org/show_bug.cgi?id=40986 ("excessive compile
time building opencollada"), this patch makes sure that no phys reg is hinted
more than once from getRegAllocationHints().

This handles the case were many virtual registers are assigned to the same
physreg. The previous compile time fix (r343686) in weightCalcHelper() only
made sure that physical/virtual registers are passed no more than once to
addRegAllocationHint().

Review: Dimitry Andric, Quentin Colombet
https://reviews.llvm.org/D59201

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355854 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-11 19:00:37 +00:00
Simon Pilgrim
ca9c63cece [X86] Extend widening comparison test.
Ensure we test both v2i16 unary and binary comparisons.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355849 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-11 18:08:20 +00:00
Petar Jovanovic
16c00e17ef [MIPS][microMIPS] Add a pattern to match TruncIntFP
A pattern needed to match TruncIntFP was missing. This was causing multiple
tests from llvm test suite to fail during compilation for micromips.

Patch by Mirko Brkusanin.

Differential Revision: https://reviews.llvm.org/D58722



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355825 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-11 14:13:31 +00:00
Petar Avramovic
e7f4ae297e [MIPS GlobalISel] NarrowScalar G_UMULH
NarrowScalar G_UMULH in LegalizerHelper 
using multiplyRegisters helper function.
NarrowScalar G_UMULH for MIPS32.

Differential Revision: https://reviews.llvm.org/D58825


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355815 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-11 10:08:44 +00:00
Petar Avramovic
4306b0ed95 [MIPS GlobalISel] NarrowScalar G_MUL
Narrow Scalar G_MUL for MIPS32.
Revisit NarrowScalar implementation in LegalizerHelper.
Introduce new helper function multiplyRegisters.
It performs generic multiplication of values held in multiple registers.
Generated instructions use only types NarrowTy and i1.
Destination can be same or two times size of the source.

Differential Revision: https://reviews.llvm.org/D58824


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355814 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-11 10:00:17 +00:00
Craig Topper
6c46962524 [X86] Enable sse2_cvtsd2ss intrinsic to use an EVEX encoded instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355810 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-11 06:01:04 +00:00
Amaury Sechet
5ddf9b4b32 Add test case for add to sub post legalization. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355797 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-11 01:25:48 +00:00
Sanjay Patel
6a6735dc58 [x86] add x86-specific opcodes to extractelement scalarization list
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355792 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-10 18:56:21 +00:00
Craig Topper
385eaa67a3 [X86] Make lowering of intrinsics with rounding mode stricter so that only valid rounding modes are lowered. Update tests accordingly
Many of our tests were not using valid rounding mode immediates. Clang verifies this in the frontend when it creates the intrinsics from builtins, but the backend would still lower invalid immediates.

With this change we will now leave them as intrinsics if the immediate is invalid. This will cause an isel selection failure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355789 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-10 17:20:45 +00:00
Nikita Popov
47bbaefec4 [AArch64] Add tests for saddsat/ssubsat; NFC
Signed versions of the existing unsigned tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355787 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-10 12:21:36 +00:00
Nikita Popov
86f7633c4d [ARM] Use non-constant operand in umulo-32.ll; NFC
Currently the store+load is folded and both operands of the umulo
end up being constants. To avoid this getting folded away entirely,
make sure at least one operand is non-constant.

Also remove some allocas which don't seem relevant to the test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355776 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-09 13:43:21 +00:00
Nikita Popov
79fd81910e [ARM] Generate test checks for umulo-32.ll; NFC
The second test case is going to be changed by D59041, so generate
full baseline checks.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355775 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-09 13:21:15 +00:00
Alex Bradbury
ee8f60f78f [RISCV] Support -target-abi at the MC layer and for codegen
This patch adds proper handling of -target-abi, as accepted by llvm-mc and
llc. Lowering (codegen) for the hard-float ABIs will follow in a subsequent
patch. However, this patch does add MC layer support for the hard float and
RVE ABIs (emission of the appropriate ELF flags
https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md#-file-header).

ABI parsing must be shared between codegen and the MC layer, so we add
computeTargetABI to RISCVUtils. A warning will be printed if an invalid or
unrecognized ABI is given.

Differential Revision: https://reviews.llvm.org/D59023



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355771 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-09 09:28:06 +00:00
Thomas Lively
aca9dda712 [WebAssembly] Use named operands to identify loads and stores
Summary:
Uses the named operands tablegen feature to look up the indices of
offset, address, and p2align operands for all load and store
instructions. This replaces brittle, incorrect logic for identifying
loads and store when eliminating frame indices, which previously
crashed on bulk-memory ops. It also cleans up the SetP2Alignment pass.

Reviewers: aheejin, dschuff

Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59007

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355770 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-09 04:31:37 +00:00
Sanjay Patel
e7e3dde903 [x86] add tests for extract of FP select; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355768 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-09 02:11:05 +00:00
Craig Topper
7c2c4642b9 [ScalarizeMaskedMemIntrin] Use IRBuilder functions that take uint32_t/uint64_t for getelementptr, extractelement, and insertelement.
This saves needing to call getInt32 ourselves. Making the code a little shorter.

The test changes are because insert/extract use getInt64 internally. Shouldn't be a functional issue.

This cleanup because I plan to write similar code for expandload/compressstore.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355767 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-09 02:08:41 +00:00
Amara Emerson
a8b5f0e7e3 [AArch64][GlobalISel] Fix i1 arguments not being zero-extended as required by ABI.
Fixes PR41001.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355745 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-08 22:17:00 +00:00
Sanjay Patel
f033627be4 [x86] scalarize extract element 0 of FP cmp
An extension of D58282 noted in PR39665:
https://bugs.llvm.org/show_bug.cgi?id=39665

This doesn't answer the request to use movmsk, but that's an
independent problem. We need this and probably still need
scalarization of FP selects because we can't do that as a
target-independent transform (although it seems likely that
targets besides x86 should have this transform).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355741 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-08 21:54:41 +00:00
Mitch Phillips
7f8f369b99 [HWASan] Save + print registers when tag mismatch occurs in AArch64.
Summary:
This change change the instrumentation to allow users to view the registers at the point at which tag mismatch occured. Most of the heavy lifting is done in the runtime library, where we save the registers to the stack and emit unwind information. This allows us to reduce the overhead, as very little additional work needs to be done in each __hwasan_check instance.

In this implementation, the fast path of __hwasan_check is unmodified. There are an additional 4 instructions (16B) emitted in the slow path in every __hwasan_check instance. This may increase binary size somewhat, but as most of the work is done in the runtime library, it's manageable.

The failure trace now contains a list of registers at the point of which the failure occured, in a format similar to that of Android's tombstones. It currently has the following format:

Registers where the failure occurred (pc 0x0055555561b4):
    x0  0000000000000014  x1  0000007ffffff6c0  x2  1100007ffffff6d0  x3  12000056ffffe025
    x4  0000007fff800000  x5  0000000000000014  x6  0000007fff800000  x7  0000000000000001
    x8  12000056ffffe020  x9  0200007700000000  x10 0200007700000000  x11 0000000000000000
    x12 0000007fffffdde0  x13 0000000000000000  x14 02b65b01f7a97490  x15 0000000000000000
    x16 0000007fb77376b8  x17 0000000000000012  x18 0000007fb7ed6000  x19 0000005555556078
    x20 0000007ffffff768  x21 0000007ffffff778  x22 0000000000000001  x23 0000000000000000
    x24 0000000000000000  x25 0000000000000000  x26 0000000000000000  x27 0000000000000000
    x28 0000000000000000  x29 0000007ffffff6f0  x30 00000055555561b4

... and prints after the dump of memory tags around the buggy address.

Every register is saved exactly as it was at the point where the tag mismatch occurs, with the exception of x16/x17. These registers are used in the tag mismatch calculation as scratch registers during __hwasan_check, and cannot be saved without affecting the fast path. As these registers are designated as scratch registers for linking, there should be no important information in them that could aid in debugging.

Reviewers: pcc, eugenis

Reviewed By: pcc, eugenis

Subscribers: srhines, kubamracek, mgorny, javed.absar, krytarowski, kristof.beyls, hiraditya, jdoerfert, llvm-commits, #sanitizers

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D58857

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355738 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-08 21:22:35 +00:00
Matt Arsenault
30e4d41861 AMDGPU: Move d16 load matching to preprocess step
When matching half of the build_vector to a load, there could still be
a hidden dependency on the other half of the build_vector the pattern
wouldn't detect. If there was an additional chain dependency on the
other value, a cycle could be introduced.

I don't think a tablegen pattern is capable of matching the necessary
conditions, so move this into PreprocessISelDAG. Check isPredecessorOf
for the other value to avoid a cycle. This has a warning that it's
expensive, so this should probably be moved into an MI pass eventually
that will have more freedom to reorder instructions to help match
this. That is currently complicated by the lack of a computeKnownBits
type mechanism for the selected function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355731 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-08 20:58:11 +00:00
Matt Arsenault
ed12070421 DAG: Don't try to cluster loads with tied inputs
This avoids breaking possible value dependencies when sorting loads by
offset.

AMDGPU has some load instructions that write into the high or low bits
of the destination register, and have a tied input for the other input
bits. These can easily have the same base pointer, but be a swizzle so
the high address load needs to come first. This was inserting glue
forcing the opposite ordering, producing a cycle the InstrEmitter
would assert on. It may be potentially expensive to look for the
dependency between the other loads, so just skip any where this could
happen.

Fixes bug 40936 by reverting r351379, which added a hacky attempt to
fix this by adding chains in this case, which I think was just working
around broken glue before the InstrEmitter. The core of the patch is
re-implementing the fix for that problem.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355728 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-08 20:46:15 +00:00
Sanjay Patel
c00fa49767 [x86] add tests for extracted vector FP cmp; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355727 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-08 20:45:27 +00:00
Matt Arsenault
6bc2eb6a5f AMDGPU: Add more tests for d16 loads
Also fix a few cases that weren't testing what they were supposed to.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355724 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-08 20:30:51 +00:00
Matt Arsenault
f031b0c044 AMDGPU: Correct DS implementation of areLoadsFromSameBasePtr
This was checking the wrong operands for the base register and the
offsets. The indexes are shifted by the number of output registers
from the machine instruction definition, and the chain is moved to the
end.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355722 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-08 20:30:50 +00:00
Amaury Sechet
bcd58e4ff6 [DAGCombiner] fold (add (add (xor a, -1), b), 1) -> (sub b, a)
Summary: This pattern is sometime created after legalization.

Reviewers: efriedma, spatel, RKSimon, zvi, bkramer

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58874

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355716 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-08 19:39:32 +00:00
Sanjay Patel
fea1ddbc35 [x86] prevent infinite looping from inverse shuffle transforms
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355713 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-08 19:20:28 +00:00
Simon Pilgrim
9808bce5cc [X86] Add test case for PR22473
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355712 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-08 19:16:26 +00:00
Simon Pilgrim
4530e3826d Fix typo in constant vector
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355699 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-08 15:17:26 +00:00
Clement Courbet
154171347b [SelectionDAG] Allow the user to specify a memeq function.
Summary:
Right now, when we encounter a string equality check,
e.g. `if (memcmp(a, b, s) == 0)`, we try to expand to a comparison if `s` is a
small compile-time constant, and fall back on calling `memcmp()` else.

This is sub-optimal because memcmp has to compute much more than
equality.

This patch replaces `memcmp(a, b, s) == 0` by `bcmp(a, b, s) == 0` on platforms
that support `bcmp`.

`bcmp` can be made much more efficient than `memcmp` because equality
compare is trivially parallel while lexicographic ordering has a chain
dependency.

Subscribers: fedor.sergeev, jyknight, ckennelly, gchatelet, llvm-commits

Differential Revision: https://reviews.llvm.org/D56593

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355672 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-08 09:07:45 +00:00
Carl Ritson
5b94cee930 [AMDGPU] V_CVT_F32_UBYTE{0,1,2,3} are full rate instructions
Summary: Fix a bug in the scheduling model where V_CVT_F32_UBYTE{0,1,2,3} are incorrectly marked as quarter rate instructions.

Reviewers: arsenm, rampitec

Reviewed By: rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59091

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355671 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-08 09:03:11 +00:00
Craig Topper
80bb69d69b [X86] Improve the type checking in isLegalMaskedLoad and isLegalMaskedGather.
We were just checking pointer size and type primitive size. But this caused unintended things like vectors of half being accepted by masked load/store.

For FP we now explicitly check for only double and float.

For pointers we now let any pointer through. Trusting that only 32 and 64 would be used to generate assembly.

We only check bitwidth after checking that the type is an integer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355667 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-08 07:33:43 +00:00
Sanjay Patel
81588cab8d [x86] add extract FP tests for target-specific nodes; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355655 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-07 23:55:54 +00:00
Konstantin Zhuravlyov
323ccae6ca AMDHSA: Code object v3 updates
- Copy kernel symbol attributes into kernel descriptor attributes
  - Make sure kernel symbol's visibility is not "higher" than protected

Differential Revision: https://reviews.llvm.org/D59057


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355630 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-07 19:58:29 +00:00
Vlad Tsyrklevich
863ea8c618 Delete x86_64 ShadowCallStack support
Summary:
ShadowCallStack on x86_64 suffered from the same racy security issues as
Return Flow Guard and had performance overhead as high as 13% depending
on the benchmark. x86_64 ShadowCallStack was always an experimental
feature and never shipped a runtime required to support it, as such
there are no expected downstream users.

Reviewers: pcc

Reviewed By: pcc

Subscribers: mgorny, javed.absar, hiraditya, jdoerfert, cfe-commits, #sanitizers, llvm-commits

Tags: #clang, #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D59034

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355624 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-07 18:56:36 +00:00
David Green
8babc52a2b [LSR] Attempt to increase the accuracy of LSR's setup cost
In some loops, we end up generating loop induction variables that look like:
  {(-1 * (zext i16 (%i0 * %i1) to i32))<nsw>,+,1}
As opposed to the simpler:
  {(zext i16 (%i0 * %i1) to i32),+,-1}
i.e we count up from -limit to 0, not the simpler counting down from limit to
0. This is because the scores, as LSR calculates them, are the same and the
second is filtered in place of the first. We end up with a redundant SUB from 0
in the code.

This patch tries to make the calculation of the setup cost a little more
thoroughly, recursing into the scev members to better approximate the setup
required. The cost function for comparing LSR costs is:

return std::tie(C1.NumRegs, C1.AddRecCost, C1.NumIVMuls, C1.NumBaseAdds,
                C1.ScaleCost, C1.ImmCost, C1.SetupCost) <
       std::tie(C2.NumRegs, C2.AddRecCost, C2.NumIVMuls, C2.NumBaseAdds,
                C2.ScaleCost, C2.ImmCost, C2.SetupCost);
So this will only alter results if none of the other variables turn out to be
different.

Differential Revision: https://reviews.llvm.org/D58770


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355597 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-07 13:44:40 +00:00
Petar Avramovic
deb1a9834b [MIPS GlobalISel] Fix mul operands
Unsigned mul high for MIPS32 is selected into two PseudoInstructions:
PseudoMULTu and PseudoMFHI that use accumulator register class ACC64 for
some of its operands. Registers in this class have appropriate hi and lo
register as subregisters: $lo0 and $hi0 are subregisters of $ac0 etc.
mul instruction implicit-defs $lo0 and $hi0 according to MipsInstrInfo.td.
In functions where mul and PseudoMULTu are present fastRegisterAllocator
will "run out of registers during register allocation" because
'calcSpillCost' for $ac0 will return spillImpossible because subregisters
$lo0 and $hi0 of $ac0 are reserved by mul instruction above. A solution is
to mark implicit-defs of $lo0 and $hi0 as dead in mul instruction.

Differential Revision: https://reviews.llvm.org/D58715


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355594 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-07 13:28:29 +00:00