21777 Commits

Author SHA1 Message Date
Yonghong Song
ddfb9993af bpf: fix bug on silently truncating 64-bit immediate
While compiling some test cases, we came across an llvm bug where 64-bit
immediates are silently truncated to 32 bits and then packed into the
BPF_JMP | BPF_K encoding.  This caused comparisons against the wrong value.

This bug looks to have been introduced by r308080 (llvm 5.0). The Select_Ri pattern is
supposed to be lowered into J*_Ri, but the latter only supports 32-bit
immediate encoding; therefore Select_Ri should have an immediate
predicate check similar to the one J*_Ri already has.

The bug is fixed by
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315889 91177308-0d34-0410-b5e6-96231b3b80d8
in llvm 6.0.

This patch is largely the same as the fix in llvm 6.0 except
one minor adjustment for the test case.

Reported-by: John Fastabend <john.fastabend@gmail.com>
Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
Reviewed-by: Yonghong Song <yhs@fb.com>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@319633 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-03 19:02:03 +00:00
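The truncation described in the BPF commit above can be illustrated with a small sketch. It is not LLVM code: the helper names (`truncate_to_imm32`, `fits_in_imm32`, `select_compare_form`) are hypothetical, and the fallback to a register-register compare is an assumption about what happens once the immediate predicate rejects a value.

```python
MASK32 = 0xFFFFFFFF

def truncate_to_imm32(value: int) -> int:
    """Keep the low 32 bits and sign-extend them, modelling a 32-bit immediate field."""
    low = value & MASK32
    return low - (1 << 32) if low & 0x80000000 else low

def fits_in_imm32(value: int) -> bool:
    """Predicate: True if the value is representable as a signed 32-bit immediate."""
    return -(1 << 31) <= value < (1 << 31)

def select_compare_form(value: int) -> str:
    """Hypothetical selection: use the immediate form only when the predicate holds."""
    return "reg-imm" if fits_in_imm32(value) else "reg-reg"

wanted = 0x1_0000_0001               # 64-bit constant the program compares against
encoded = truncate_to_imm32(wanted)  # what a silently truncating encoder would emit
```

Here `encoded` differs from `wanted`, so a comparison using the truncated immediate tests against the wrong value, while the predicate check steers such constants away from the immediate form.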
Tom Stellard
9bc5c88218 Merging r316035:
------------------------------------------------------------------------
r316035 | tnorthover | 2017-10-17 14:43:52 -0700 (Tue, 17 Oct 2017) | 6 lines

AArch64: account for possible frame index operand in compares.

If the address of a local is used in a comparison, AArch64 can fold the
address-calculation into the comparison via "adds". Unfortunately, a couple of
places (both hit in this one test) are not ready to deal with that yet and just
assume the first source operand is a register.
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@319231 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-28 22:02:15 +00:00
Tom Stellard
3e43049041 Merging r319130:
------------------------------------------------------------------------
r319130 | matze | 2017-11-27 17:17:52 -0800 (Mon, 27 Nov 2017) | 7 lines

ARM: Fix PR32578

https://llvm.org/PR32578

I simplified and converted the reproducer into a lit test.

Patch by Vedant Kumar!
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@319181 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-28 16:35:04 +00:00
Tom Stellard
f30c918816 Merging r318788:
------------------------------------------------------------------------
r318788 | mcrosier | 2017-11-21 10:08:34 -0800 (Tue, 21 Nov 2017) | 16 lines

[AArch64] Mark mrs of TPIDR_EL0 (thread pointer) as *having* side effects.

This partially reverts r298851.  The underlying issue is that we don't
currently model the dependency between mrs (read system register) and
msr (write system register) instructions.

Something like the below should never be reordered:

 msr TPIDR_EL0, x0  ;; set thread pointer
 mrs x8, TPIDR_EL0  ;; read thread pointer

but was being reordered after r298851.  The functional part of the patch
that wasn't reverted needed to remain in place in order to not break
r299462.

PR35317
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@318854 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-22 18:04:47 +00:00
Tom Stellard
d27dcd3290 Merging r317204 and r318172:
------------------------------------------------------------------------
r317204 | sdardis | 2017-11-02 05:47:22 -0700 (Thu, 02 Nov 2017) | 15 lines

[mips] Use register scavenging with MSA.

MSA stores and loads to the stack are more likely to require an
emergency GPR spill slot due to the smaller offsets available
with those instructions.

Handle this by overestimating the size of the stack by determining
the largest offset presuming that all callee save registers are
spilled and accounting for incoming arguments when determining
whether an emergency spill slot is required.

Reviewers: atanasyan

Differential Revision: https://reviews.llvm.org/D39056

------------------------------------------------------------------------

------------------------------------------------------------------------
r318172 | sdardis | 2017-11-14 11:11:45 -0800 (Tue, 14 Nov 2017) | 5 lines

[mips] Simplify test for 5.0.1 (NFC)

Simplify testing that an emergency spill slot is used when MSA
is used so that it can be included in the 5.0.1 release.

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@318191 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-14 20:40:11 +00:00
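The overestimation strategy in r317204 above can be sketched roughly as follows. This is an illustration only: the function names are hypothetical, and the offset widths used in the usage lines (10 bits for the narrow case, 16 bits for the wide case) are assumptions made for the example, not values taken from the commit message.

```python
def fits_signed(value: int, bits: int) -> bool:
    """True if value is representable as a signed immediate of the given width."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))

def estimate_max_offset(locals_size: int, callee_saved_sizes: list,
                        incoming_args_size: int) -> int:
    """Pessimistic estimate: assume every callee-saved register is spilled
    and include the incoming argument area."""
    return locals_size + sum(callee_saved_sizes) + incoming_args_size

def needs_emergency_slot(max_offset: int, offset_bits: int) -> bool:
    """An emergency spill slot is needed when the largest offset may not be encodable."""
    return not fits_signed(max_offset, offset_bits)

# Hypothetical frame: 400 bytes of locals, ten 4-byte callee-saved registers,
# and 64 bytes of incoming arguments.
estimate = estimate_max_offset(400, [4] * 10, 64)
narrow = needs_emergency_slot(estimate + 100, 10)  # assumed narrow offset field
wide = needs_emergency_slot(estimate + 100, 16)    # assumed wide offset field
```

The same estimated frame needs a scavenging slot with the narrow offset field but not with the wide one, which is the situation the commit describes for instructions with smaller offsets.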
Tom Stellard
16afcbb6b3 Merging r314798:
------------------------------------------------------------------------
r314798 | sdardis | 2017-10-03 06:45:49 -0700 (Tue, 03 Oct 2017) | 9 lines

[mips] Enable spilling and reloading of the dsp register set.

The dsp register class is an alias of the gpr register class, so
we have to define instructions for spilling and reloading.

Reviewers: atanasyan

Differential Revision: https://reviews.llvm.org/D38038

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@318183 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-14 19:54:26 +00:00
Tom Stellard
f042b00be5 Merging r315485:
------------------------------------------------------------------------
r315485 | spatel | 2017-10-11 11:24:21 -0700 (Wed, 11 Oct 2017) | 7 lines

[x86] avoid infinite loop from SoftenFloatOperand (PR34866)

Legalization of fp128 assumes things that we should have asserts for,
so that's another potential improvement.

Differential Revision: https://reviews.llvm.org/D38771

------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@316607 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-25 20:56:42 +00:00
Dylan McKay
d14542fb74 Merging r314896:
------------------------------------------------------------------------
r314896 | dylanmckay | 2017-10-04 23:33:36 +1300 (Wed, 04 Oct 2017) | 3 lines

[AVR] Elaborate LDWRdPtr into `ld r, X++; ld r+1, X`

Patch by Gergo Erdi.
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@315834 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-14 22:30:19 +00:00
Dylan McKay
1f11ce6781 Merging r314891:
------------------------------------------------------------------------
r314891 | dylanmckay | 2017-10-04 22:51:28 +1300 (Wed, 04 Oct 2017) | 8 lines

[AVR] Insert JMP for long branches

Previously, on long branches (relative jumps of >4 kB), an assertion
failure was hit, as AVRInstrInfo::insertIndirectBranch was not
implemented. Despite its name, it is called by the branch relaxator
for *all* unconditional jumps.

Patch by Thomas Backman.
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@315833 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-14 22:29:48 +00:00
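The ">4 kB" limit mentioned in r314891 above follows from the size of the relative jump's offset field. The sketch below assumes a 12-bit signed offset counted in 2-byte words and ignores the fact that the real offset is relative to the following instruction; `rjmp_reaches` and `pick_jump` are hypothetical helper names, not LLVM functions.

```python
RJMP_OFFSET_BITS = 12   # assumed width of the signed relative offset field
WORD_BYTES = 2          # offsets are counted in 2-byte instruction words

def rjmp_reaches(byte_distance: int) -> bool:
    """True if a relative jump can cover the given (even) byte distance."""
    if byte_distance % WORD_BYTES:
        raise ValueError("branch targets must be word-aligned")
    words = byte_distance // WORD_BYTES
    limit = 1 << (RJMP_OFFSET_BITS - 1)
    return -limit <= words < limit

def pick_jump(byte_distance: int) -> str:
    """Use the short relative form when it reaches, otherwise the long absolute jump."""
    return "rjmp" if rjmp_reaches(byte_distance) else "jmp"
```

For example, `pick_jump(4094)` stays with the relative form, while `pick_jump(4096)` needs the long jump.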
Dylan McKay
1fc9dfdeac Merging r314890:
------------------------------------------------------------------------
r314890 | dylanmckay | 2017-10-04 22:51:21 +1300 (Wed, 04 Oct 2017) | 16 lines

[AVR] Fix displacement overflow for LDDW/STDW

In some cases, the code generator attempts to generate instructions such as:

lddw r24, Y+63

which expands to:

ldd r24, Y+63
ldd r25, Y+64 # Oops! This is actually ld r25, Y in the binary

This commit limits the first offset to 62, and thus the second to 63.
It also updates some asserts in AVRExpandPseudoInsts.cpp, including for
INW and OUTW, which appear to be unused.

Patch by Thomas Backman.
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@315832 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-14 22:29:18 +00:00
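The displacement overflow in r314890 above can be reproduced with a few lines of arithmetic. The sketch assumes a 6-bit unsigned displacement field that silently wraps when given a larger value, which is how the "Oops" in the commit message is modelled here; the helper names are hypothetical.

```python
DISP_BITS = 6                        # assumed width of the displacement field
DISP_MASK = (1 << DISP_BITS) - 1     # largest encodable displacement: 63

def encode_displacement(q: int) -> int:
    """Model an encoder that silently keeps only the low bits of the displacement."""
    return q & DISP_MASK

def expand_lddw(offset: int) -> tuple:
    """A 16-bit load expands into two byte loads at offset and offset + 1."""
    return offset, offset + 1

def max_safe_word_offset() -> int:
    """The high byte lives at offset + 1, so the first offset must leave room for it."""
    return DISP_MASK - 1

low, high = expand_lddw(63)
wrapped_high = encode_displacement(high)   # 64 wraps around to 0
```

With a first offset of 63 the second load's displacement wraps to 0, whereas capping the first offset at `max_safe_word_offset()` keeps both displacements encodable.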
Craig Topper
cbc2c76b28 Merging r313366:
------------------------------------------------------------------------
r313366 | ctopper | 2017-09-15 10:09:03 -0700 (Fri, 15 Sep 2017) | 9 lines

[X86] Don't create i64 constants on 32-bit targets when lowering v64i1 constant build vectors

When handling a v64i1 build vector of constants on 32-bit targets we were creating an illegal i64 constant that we then bitcasted back to v64i1. We need to instead create two 32-bit constants, bitcast them to v32i1 and concat the result. We should also take care to handle the halves being all zeros/ones after the split.

This patch splits the build vector and then recursively lowers the two pieces. This allows us to handle the all ones and all zeros cases with minimal effort. Ideally we'd just do the split and concat, and let lowering get called again on the new nodes, but getNode has special handling for CONCAT_VECTORS that reassembles the pieces back into a single BUILD_VECTOR. Hopefully the two temporary BUILD_VECTORS we had to create to do this that don't get returned don't cause any issues.

Fixes PR34605.

Differential Revision: https://reviews.llvm.org/D37858
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@315198 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-08 23:49:29 +00:00
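The split described in r313366 above amounts to cutting a 64-bit mask into two 32-bit halves and treating all-zeros and all-ones halves specially. The following sketch models only that arithmetic, with plain integers standing in for the constant vectors; the function names are hypothetical and do not correspond to LLVM APIs.

```python
HALF_MASK = 0xFFFFFFFF

def split_v64i1(mask: int) -> tuple:
    """Split a 64-bit mask constant into (low, high) 32-bit constants."""
    if not 0 <= mask < (1 << 64):
        raise ValueError("mask must fit in 64 bits")
    return mask & HALF_MASK, mask >> 32

def classify_half(half: int) -> str:
    """Identify the halves that can be lowered without a general constant."""
    if half == 0:
        return "all-zeros"
    if half == HALF_MASK:
        return "all-ones"
    return "constant"

def concat_halves(low: int, high: int) -> int:
    """Reassemble the original mask from its two halves."""
    return (high << 32) | low

low, high = split_v64i1(0xFFFFFFFF_00000000)
```

In this example the low half is all zeros and the high half is all ones, so neither half needs a general 32-bit constant.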
Dylan McKay
9bc59d4c9b Revert r314892
It was accidentally merged.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@314893 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-04 10:06:12 +00:00
Dylan McKay
3f312d2ca3 Merging r314891:
------------------------------------------------------------------------
r314891 | dylanmckay | 2017-10-04 22:51:28 +1300 (Wed, 04 Oct 2017) | 8 lines

[AVR] Insert JMP for long branches

Previously, on long branches (relative jumps of >4 kB), an assertion
failure was hit, as AVRInstrInfo::insertIndirectBranch was not
implemented. Despite its name, it is called by the branch relaxator
for *all* unconditional jumps.

Patch by Thomas Backman.
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@314892 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-04 10:01:09 +00:00
Renato Golin
f1ff67e5f1 [release_50] Merging r313916
[AArch64] Fix bug in store of vector 0 DAGCombine.

Summary:
Avoid using XZR/WZR directly as operands to split stores of zero
vectors.  Doing so can lead to the XZR/WZR being used by an instruction
that doesn't allow it (e.g. add).

Fixes bug 34674.

Reviewers: t.p.northover, efriedma, MatzeB

Subscribers: aemerson, rengolin, javed.absar, mcrosier, eraman, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D38146

PR34695.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@314796 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-03 13:13:34 +00:00
Tom Stellard
1122b45a01 Merging r312348:
------------------------------------------------------------------------
r312348 | matze | 2017-09-01 11:36:26 -0700 (Fri, 01 Sep 2017) | 39 lines

LiveIntervalAnalysis: Fix alias regunit reserved definition

A register in CodeGen can be marked as reserved: In that case we
consider the register always live and do not use (or rather ignore)
kill/dead/undef operand flags.

LiveIntervalAnalysis however tracks liveness per register unit (not per
register). We already needed adjustments for this in r292871 to deal
with super/sub registers. However I did not look at aliased registers
there. Looking at ARM:

FPSCR (regunits FPSCR, FPSCR~FPSCR_NZCV) aliases with FPSCR_NZCV
(regunits FPSCR_NZCV, FPSCR~FPSCR_NZCV) hence they share a register unit
(FPSCR~FPSCR_NZCV) that represents the aliased parts of the registers.
This shared register unit was previously considered non-reserved;
however, given that uses of the reserved FPSCR potentially violate
some rules (like uses without defs), we should make FPSCR~FPSCR_NZCV
reserved too and stop tracking liveness for it.

This patch:
- Defines a register unit as reserved when: At least for one root
  register, the root register and all its super registers are reserved.
- Adjusts LiveIntervals::computeRegUnitRange() for the new reserved
  definition.
- Adds MachineRegisterInfo::isReservedRegUnit() to have a canonical way
  of testing.
- Stops computing LiveRanges for reserved register units in HMEditor even
  with UpdateFlags enabled.
- Skips verification of uses of reserved reg units in the machine
  verifier (this usually didn't happen because there would be no cached
  liverange but there is no guarantee for that and I would run into this
  case before the HMEditor tweak, so may as well fix the verifier too).

Note that this should only affect ARM's FPSCR/FPSCR_NZCV registers today;
aliased registers are rarely used, the only other cases are Hexagon's
P0-P3/P3_0 and C8/USR pairs, which do not mix reserved/non-reserved
registers in an alias.

Differential Revision: https://reviews.llvm.org/D37356
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@314565 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-29 23:22:57 +00:00
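The reserved-register-unit rule stated in r312348 above ("at least for one root register, the root register and all its super registers are reserved") can be written as a short predicate. This is a sketch over plain Python sets, not the MachineRegisterInfo implementation; the data layout and the example register tables are assumptions made for illustration.

```python
def is_reserved_reg_unit(roots, super_regs, reserved) -> bool:
    """A unit is reserved if, for at least one of its root registers,
    the root and every one of its super registers are reserved.

    roots      -- root registers of the unit
    super_regs -- hypothetical mapping: register -> list of its super registers
    reserved   -- set of reserved registers
    """
    return any(
        root in reserved and all(s in reserved for s in super_regs.get(root, ()))
        for root in roots
    )

# Illustrative alias unit shared by FPSCR and FPSCR_NZCV, with only FPSCR reserved.
alias_unit = is_reserved_reg_unit(["FPSCR", "FPSCR_NZCV"], {}, {"FPSCR"})

# Hypothetical sub-register SUB whose super register SUPER is not reserved.
sub_unit = is_reserved_reg_unit(["SUB"], {"SUB": ["SUPER"]}, {"SUB"})
```

Under this rule the shared alias unit counts as reserved because one of its roots is fully reserved, while a unit whose only root has a non-reserved super register does not.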
Tom Stellard
b5ef7376bc Merging r314252:
------------------------------------------------------------------------
r314252 | gberry | 2017-09-26 14:40:46 -0700 (Tue, 26 Sep 2017) | 12 lines

[AArch64][Falkor] Fix bug in falkor prefetcher fix pass.

Summary:
In rare cases, loads that don't get prefetched that were marked as
strided loads could cause a crash if they occurred in a loop with other
colliding loads.

Reviewers: mcrosier

Subscribers: aemerson, rengolin, javed.absar, kristof.beyls

Differential Revision: https://reviews.llvm.org/D38261
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@314555 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-29 20:35:06 +00:00
Tom Stellard
50ee711c34 Merging r314251:
------------------------------------------------------------------------
r314251 | gberry | 2017-09-26 14:40:41 -0700 (Tue, 26 Sep 2017) | 16 lines

[AArch64][Falkor] Fix correctness bug in falkor prefetcher fix pass and correct some opcode tag computations.

Summary:
This addresses a correctness bug for LD[1234]*_POST opcodes that have
the prefetcher fix applied to them: the base register was not being
written back from the temp after being incremented, so it would appear
to never be incremented.

Also, fix some opcode tag computations based on some updated HW details
to get better tag avoidance and thus better prefetcher performance.

Reviewers: mcrosier

Subscribers: aemerson, rengolin, javed.absar, kristof.beyls

Differential Revision: https://reviews.llvm.org/D38256
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@314554 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-29 20:30:55 +00:00
Joerg Sonnenberger
43bd69c68c Merging r311921:
------------------------------------------------------------------------
r311921 | joerg | 2017-08-28 22:20:47 +0200 (Mon, 28 Aug 2017) | 16 lines

Fix ARMv4 support

ARMv4 doesn't support the "BX" instruction, which was introduced
with ARMv4t. Adjust the call lowering and tail call implementation
accordingly.

Further changes are necessary to ensure that presence of the v4t feature
is correctly set. Most importantly, the "generic" CPU for thumb-*
triples should include ARMv4t, since thumb mode without thumb support
would naturally be pointless.

Add a couple of asserts to ensure thumb instructions are not emitted
without CPU support.

Differential Revision: https://reviews.llvm.org/D37030

------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@314417 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-28 14:13:54 +00:00
Dylan McKay
93c290035c Merging r314183:
------------------------------------------------------------------------
r314183 | dylanmckay | 2017-09-26 15:07:54 +1300 (Tue, 26 Sep 2017) | 3 lines

[AVR] Fix the build after setting alignment to 1 in r314179

Changing all types to be byte-aligned broke a small number of tests.
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@314382 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-28 07:13:51 +00:00
Dylan McKay
9e83867929 Merging r311620:
------------------------------------------------------------------------
r311620 | dylanmckay | 2017-08-24 12:14:38 +1200 (Thu, 24 Aug 2017) | 1 line

[AVR] Use the correct register classes for 16-bit atomic operations
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@314358 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-27 22:18:57 +00:00
Dylan McKay
bc9510e749 Merging r312905:
------------------------------------------------------------------------
r312905 | dylanmckay | 2017-09-11 22:32:51 +1200 (Mon, 11 Sep 2017) | 10 lines

[AVR] Enable the '__do_copy_data' function

Also enables '__do_clear_bss'.

These functions are automatically called by the CRT if they are
declared.

We need these to be called otherwise RAM will start completely
uninitialised, even though we need to copy RAM variables from progmem to
RAM.
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@314356 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-27 22:15:50 +00:00
Tom Stellard
164431e600 Merging r312337:
------------------------------------------------------------------------
r312337 | nha | 2017-09-01 09:56:32 -0700 (Fri, 01 Sep 2017) | 12 lines

AMDGPU: IMPLICIT_DEFs and DBG_VALUEs do not contribute to wait states

Summary:
This fixes a bug that was exposed on gfx9 in various
GL45-CTS.shaders.loops.*_iterations.select_iteration_count_fragment tests,
e.g. GL45-CTS.shaders.loops.do_while_uniform_iterations.select_iteration_count_fragment

Reviewers: arsenm

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D36193
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@314327 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-27 18:08:25 +00:00
Tom Stellard
b34136ea85 Revert "Merging r312337:"
This reverts commit r314324.

I unintentionally deleted most of the svn:mergeinfo for the release_50
branch with this commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@314326 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-27 18:06:46 +00:00
Tom Stellard
a6177dc4c3 Merging r312337:
------------------------------------------------------------------------
r312337 | nha | 2017-09-01 09:56:32 -0700 (Fri, 01 Sep 2017) | 12 lines

AMDGPU: IMPLICIT_DEFs and DBG_VALUEs do not contribute to wait states

Summary:
This fixes a bug that was exposed on gfx9 in various
GL45-CTS.shaders.loops.*_iterations.select_iteration_count_fragment tests,
e.g. GL45-CTS.shaders.loops.do_while_uniform_iterations.select_iteration_count_fragment

Reviewers: arsenm

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D36193
------------------------------------------------------------------------

git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@314324 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-27 17:56:19 +00:00
Hans Wennborg
def6ebbd31 Merging r311623:
------------------------------------------------------------------------
r311623 | hans | 2017-08-23 18:08:27 -0700 (Wed, 23 Aug 2017) | 11 lines

[DAG] Fix Node Replacement in PromoteIntBinOp

When one operand is a user of another in a promoted binary operation
we may replace and delete the returned value before returning
triggering an assertion. Reorder node replacements to prevent this.

Fixes PR34137.

Landing on behalf of Nirav.

Differential Revision: https://reviews.llvm.org/D36581
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@311670 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-24 16:16:07 +00:00
Hans Wennborg
c14202c45b Revert r307529 "This patch completely replaces the scheduling information for the SandyBridge architecture"
This caused PR34080, which seems to have been fixed by r310792, but that change
introduced severe performance regressions.

Reverting to unblock the 5.0.0 release while these issues are worked out on trunk.

Also reverting a few tests that were added later and depended on the new scheduling:

    LLVM :: CodeGen/X86/f16c-schedule.ll
    LLVM :: CodeGen/X86/lea32-schedule.ll
    LLVM :: CodeGen/X86/lea64-schedule.ll
    LLVM :: CodeGen/X86/popcnt-schedule.ll


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@311600 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 21:17:15 +00:00
Hans Wennborg
a8f7dd91f2 Merging r311572:
------------------------------------------------------------------------
r311572 | ctopper | 2017-08-23 09:41:02 -0700 (Wed, 23 Aug 2017) | 9 lines

[AVX512] Don't create SHRUNKBLEND SDNodes for 512-bit vectors

There are no 512-bit blend instructions so we shouldn't create SHRUNKBLEND for them.

On a side note, it looks like there may be a missed opportunity for constant folding TESTM when LHS and RHS are equal.

This fixes PR34139.

Differential Revision: https://reviews.llvm.org/D36992
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@311593 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-23 20:03:43 +00:00
Hans Wennborg
9347fa6ce8 Merging r311263:
------------------------------------------------------------------------
r311263 | ctopper | 2017-08-19 15:02:02 -0700 (Sat, 19 Aug 2017) | 1 line

[AVX512] Use alignedstore256 in a pattern that's emitting a 256-bit movaps from an extract subvector operation.
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@311478 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-22 17:44:59 +00:00
Hans Wennborg
eb7354c3b1 Merging r311429:
------------------------------------------------------------------------
r311429 | ctopper | 2017-08-21 22:40:17 -0700 (Mon, 21 Aug 2017) | 9 lines

[X86] Prevent several calls to ISD::isConstantSplatVector from returning a narrower APInt than the original scalar type

ISD::isConstantSplatVector can shrink to the smallest splat width. But we don't check the size of the resulting APInt at all. This can cause us to misinterpret the results.

This patch just adds a flag to prevent the APInt from changing width.

Fixes PR34271.

Differential Revision: https://reviews.llvm.org/D36996
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@311462 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-22 16:17:32 +00:00
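The hazard described in r311429 above, a splat check that reports a narrower value than the element type, can be modelled with integers. The sketch below is not the ISD::isConstantSplatVector implementation: the halving loop, the 8-bit lower bound, and the `allow_shrink` parameter (standing in for the flag the patch adds) are assumptions for illustration.

```python
def smallest_splat(value: int, bits: int, allow_shrink: bool = True,
                   min_bits: int = 8) -> tuple:
    """Return (splat_value, splat_bits).

    While the upper and lower halves of the value are identical, the value is
    itself a splat of a narrower element, so the width is halved. With
    allow_shrink=False the original width is always kept.
    """
    if not allow_shrink:
        return value, bits
    while bits > min_bits and bits % 2 == 0:
        half = bits // 2
        low = value & ((1 << half) - 1)
        high = value >> half
        if low != high:
            break
        value, bits = low, half
    return value, bits

shrunk = smallest_splat(0x01010101, 32)                    # narrower than the element
kept = smallest_splat(0x01010101, 32, allow_shrink=False)  # width preserved
```

A caller that ignores the returned width would read `shrunk` as the 32-bit constant 1 instead of 0x01010101, which is the kind of misinterpretation the commit guards against.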
Hans Wennborg
37c9f925e3 Merging r311061:
------------------------------------------------------------------------
r311061 | compnerd | 2017-08-16 19:42:24 -0700 (Wed, 16 Aug 2017) | 10 lines

ARM: mark CPSR as clobbered for Windows VLAs

When lowering a VLA, we emit a __chkstk call.  However, this call can
internally clobber CPSR.  We did not mark this register as an ImpDef,
which could potentially allow a comparison to be hoisted above the call
to `__chkstk`.  In such a case, the CPSR could be clobbered, and the
check invalidated.  When the support was initially added, it seemed that
the call would take care of preventing CPSR from being clobbered, but
this is not the case.  Mark the register as clobbered to fix a possible
state corruption.
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@311461 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-22 16:09:55 +00:00
Hans Wennborg
94bd5ab124 Merging r311071:
------------------------------------------------------------------------
r311071 | eladcohen | 2017-08-17 01:06:36 -0700 (Thu, 17 Aug 2017) | 13 lines

[SelectionDAG] Teach the vector-types operand scalarizer about SETCC

When v1i1 is legal (e.g. AVX512) the legalizer can reach
a case where a v1i1 SETCC with an illegal vector type operand
wasn't scalarized (since v1i1 is legal) but its operands do
have to be scalarized. This used to assert because SETCC was
missing from the vector operand scalarizer.

This patch attempts to teach the legalizer to handle these cases
by scalarizing the operands, converting the node into a scalar
SETCC node.

Differential revision: https://reviews.llvm.org/D36651
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@311409 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 23:28:04 +00:00
Hans Wennborg
a009ec4414 Merging r311258:
------------------------------------------------------------------------
r311258 | mstorsjo | 2017-08-19 12:47:48 -0700 (Sat, 19 Aug 2017) | 9 lines

[ARM] Check the right order for halves of VZIP/VUZP if both parts are used

This is the exact same fix as in SVN r247254. In that commit, the fix was
applied only for isVTRNMask and isVTRN_v_undef_Mask, but the same issue
is present for VZIP/VUZP as well.

This fixes PR33921.

Differential Revision: https://reviews.llvm.org/D36899
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@311369 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-21 19:54:09 +00:00
Hans Wennborg
33d9c6a253 Merging r310066:
------------------------------------------------------------------------
r310066 | mcrosier | 2017-08-04 09:44:06 -0700 (Fri, 04 Aug 2017) | 4 lines

[AArch64] Fix an assertion for pre-index generation with unscaled loads/stores.

Differential Revision: https://reviews.llvm.org/D36248
PR34035
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@311192 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-18 20:25:45 +00:00
Hans Wennborg
6c71d138bc Merging r310498:
------------------------------------------------------------------------
r310498 | guyblank | 2017-08-09 10:21:01 -0700 (Wed, 09 Aug 2017) | 9 lines

[X86][AVX512] Choose correct registers in vpbroadcastb/w

Fixes the vpbroadcastb/w instructions which use GPRs as source operands, to use the correct registers.
The full GPR should be used, and not the subregister, as it happens before the patch.

Fixes pr33795

Differential Revision:
https://reviews.llvm.org/D36479
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@311110 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 17:42:21 +00:00
Hans Wennborg
57d2db4719 Merging r310979:
------------------------------------------------------------------------
r310979 | qcolombet | 2017-08-15 17:17:05 -0700 (Tue, 15 Aug 2017) | 38 lines

[VirtRegRewriter] Properly model the register liveness on undef subreg definition

Undef subreg definition means that the content of the super register
doesn't matter at this point. While that's true for virtual registers,
this may not hold when replacing them with actual physical registers.
Indeed, some part of the physical register may be coalesced with the
related virtual register and thus, the values for those parts matter and
must be live.

The fix consists in checking whether or not subregs of the physical register
being assigned to an undef subreg definition are live through that def and
insert an implicit use if they are. Doing so, will keep them alive until
that point like they should be.

E.g., let vreg14 be assigned to R0_R1, then
%vreg14:gsub_0<def,read-undef> = COPY %R0 ; <-- R1 is still live here
%vreg14:gsub_1<def> = COPY %R1

Before this change, the rewriter would change the code into:
%R0<def> = KILL %R0, %R0_R1<imp-def> ; <-- this tells R1 is redefined
%R1<def> = KILL %R1, %R0_R1<imp-def>, %R0_R1<imp-use> ; this value of this R1
                                                      ; is believed to come
                                                      ; from the previous
                                                      ; instruction

Because of this invalid liveness, later passes could make wrong choices and, in
particular, clobber a live register, as happened with the register scavenger in
llvm.org/PR34107

Now we would generate:
%R0<def> = KILL %R0, %R0_R1<imp-def>, %R0_R1<imp-use> ; This tells R1 needs to
                                                      ; reach this point
%R1<def> = KILL %R1, %R0_R1<imp-def>, %R0_R1<imp-use>

The bug has been here forever, it got exposed recently because the register
scavenger got smarter.

Fixes llvm.org/PR34107
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@311108 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-17 17:36:27 +00:00
Hans Wennborg
801c03eec9 Merging r310784:
------------------------------------------------------------------------
r310784 | ctopper | 2017-08-12 13:19:44 -0700 (Sat, 12 Aug 2017) | 16 lines

[X86] When handling addcarry intrinsic, create the flag result with the correct type so we don't crash if we use a memory instruction

Summary:
Previously we were creating the flag result with MVT::Other, which is interpreted as a Chain node. If we used a memory form of the instruction we would end up with a copyToReg that consumed the chain result of the adcx instruction instead of the flag result.

Pretty sure we should be using MVT::i32 here; that's what we do in other places where we create these node types.

We should probably consider this for 5.0 as well.

Reviewers: RKSimon, zvi, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36645
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@310899 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-15 00:09:10 +00:00
Hans Wennborg
05ef777d84 Merging r310604:
------------------------------------------------------------------------
r310604 | niravd | 2017-08-10 08:12:32 -0700 (Thu, 10 Aug 2017) | 13 lines

[X86] Keep dependencies when constructing loads in combineStore

Summary:
Preserve chain dependencies between old and new loads constructed to
prevent loads from reordering below later stores.

Fixes PR34088.

Reviewers: craig.topper, spatel, RKSimon, efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36528
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@310678 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-11 01:53:40 +00:00
Hans Wennborg
d10901a8de Merging r309614:
------------------------------------------------------------------------
r309614 | kbelochapka | 2017-07-31 13:11:49 -0700 (Mon, 31 Jul 2017) | 7 lines

[X86][MMX] Added custom lowering action for MMX SELECT (PR30418)
Fix for pr30418 - error in backend: Cannot select: t17: x86mmx = select_cc t2, Constant:i64<0>, t7, t8, seteq:ch
Differential Revision: https://reviews.llvm.org/D34661




------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@310673 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-11 00:51:35 +00:00
Hans Wennborg
999f7b7049 Merging r310552:
------------------------------------------------------------------------
r310552 | eladcohen | 2017-08-10 00:44:23 -0700 (Thu, 10 Aug 2017) | 19 lines

[SelectionDAG] When scalarizing vselect, don't assert on
a legal cond operand.

When scalarizing the result of a vselect, the legalizer currently expects
to already have scalarized the operands. While this is true for the true/false
operands (which have the same type as the result), it is not the case for the
condition operand. On X86 AVX512, v1i1 is legal, so operations such
as '< N x type> vselect < N x i1> < N x type> < N x type>', where < N x type > is
illegal, hit an assertion during the scalarization.

The handling is similar to r205625.
This also exposes the fact that (v1i1 extract_subvector) should be legal
and selectable on AVX512 - We do this by custom lowering to vector_extract_elt.
In some cases this still leaves us with redundant dag nodes, which will be
combined in a separate, soon-to-come patch.

This fixes pr33349.

Differential revision: https://reviews.llvm.org/D36511
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@310635 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-10 17:36:36 +00:00
Hans Wennborg
dc84a08de8 Merging r310534:
------------------------------------------------------------------------
r310534 | matze | 2017-08-09 15:22:05 -0700 (Wed, 09 Aug 2017) | 20 lines

ARM: Fix CMP_SWAP expansion

Clean up after my misguided attempt in r304267 to "fix" CMP_SWAP
returning an uninitialized status value.

- I was always using tMOVi8 to zero the status register, which cannot
  encode higher register numbers (and llvm would silently miscompile)

- Nobody was ever looking at that status value outside the expansion.
  ARMDAGToDAGISel::SelectCMP_SWAP(), the only place creating CMP_SWAP
  instructions, was not mapping anything to it. (The cmpxchg status value
  from llvm IR is lowered to a manual comparison after the CMP_SWAP)

So this:
- Renames the register from "status" to "temp" to make it obvious that
  it isn't used outside the expansion.
- Removes the zeroing of the status/temp register.
- Keeps the live-in list improvements from r304267

Fixes http://llvm.org/PR34056
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@310628 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-10 17:12:27 +00:00
Hans Wennborg
1d8866decc Merging r310190:
------------------------------------------------------------------------
r310190 | ctopper | 2017-08-05 16:34:44 -0700 (Sat, 05 Aug 2017) | 18 lines

[X86] Enable isel to use the PAUSE instruction even when SSE2 is disabled

Summary:
On older processors this instruction encoding is treated as a NOP.

MSVC doesn't disable intrinsics based on features the way clang/gcc does. Because the PAUSE instruction encoding doesn't crash older processors, some software out there uses these intrinsics without checking for SSE2.

This change also seems to be consistent with gcc behavior.

Fixes PR34079

Reviewers: RKSimon, zvi

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D36361
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@310293 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-07 20:14:13 +00:00
Hans Wennborg
f632da1f79 Merging r309930:
------------------------------------------------------------------------
r309930 | sdardis | 2017-08-03 02:38:46 -0700 (Thu, 03 Aug 2017) | 19 lines

[SelectionDAG] Resolve PR33978.

rL306209 taught SelectionDAG how to add the dereferenceable flag when
expanding memcpy and memmove. The fix however contained a nit where
the offset + size was constructed as an APInt of PointerSize rather
than PointerSizeInBits.

This led to isDereferenceableAndAlignedPointer() getting truncated values, or
values which would be sign extended within that function, leading to
incorrect results.
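
To illustrate the width mix-up described above, here is a minimal Python model of
fixed-width integer arithmetic. It is not LLVM code: the helper names, the 64-bit
target and the value 300 are assumptions chosen only for illustration. It shows
how a value built with a width of 8 (the pointer size in bytes) wraps or turns
negative, while the same value at a width of 64 (the pointer size in bits) is
preserved.

```python
def truncate(value: int, bits: int) -> int:
    """Keep only the low `bits` bits, like a fixed-width unsigned integer."""
    return value & ((1 << bits) - 1)


def sign_extend(value: int, bits: int) -> int:
    """Reinterpret the low `bits` bits of `value` as a signed integer."""
    value = truncate(value, bits)
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


# Hypothetical 64-bit target: pointers are 8 bytes, i.e. 64 bits.
POINTER_SIZE_BYTES = 8
POINTER_SIZE_BITS = 8 * POINTER_SIZE_BYTES

offset_plus_size = 300  # illustrative value only

# Wrong width (byte count used as a bit count): the value wraps.
wrong = truncate(offset_plus_size, POINTER_SIZE_BYTES)
# Correct width (bit count): the value is preserved.
right = truncate(offset_plus_size, POINTER_SIZE_BITS)

print(wrong, right)
# A value with the top bit of the narrow width set becomes negative
# once it is sign extended.
print(sign_extend(200, POINTER_SIZE_BYTES))
```

Any offset + size above 255 is silently altered in the narrow case, which matches
the kind of incorrect result the commit message describes.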

Thanks to Alex Crichton for reporting the issue!

This resolves PR33978.

Reviewers: inouehrs

Differential Revision: https://reviews.llvm.org/D36236

------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@309956 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-03 16:24:57 +00:00
Hans Wennborg
6db41ec7ee Merging r309744:
------------------------------------------------------------------------
r309744 | mstorsjo | 2017-08-01 14:13:54 -0700 (Tue, 01 Aug 2017) | 29 lines

[AArch64] Rewrite stack frame handling for win64 vararg functions

The previous attempt, which made do with a single offset in
computeCalleeSaveRegisterPairs, wasn't quite enough. The previous
attempt only worked as long as CombineSPBump == true (since the
offset would be adjusted later in fixupCalleeSaveRestoreStackOffset).

Instead, include the size of the fixed stack area used for win64
varargs in calculations in emitPrologue/emitEpilogue. The stack
consists mainly of three parts:
- AFI->getLocalStackSize()
- AFI->getCalleeSavedStackSize()
- FixedObject

Most of the places in the code which previously used the CSStackSize
now use PrologueSaveSize instead, which is the sum of the latter
two, while some cases which need exactly the middle one use
AFI->getCalleeSavedStackSize() explicitly instead of a local variable.
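
The relationship between the three parts can be sketched as plain arithmetic.
This is a Python model, not the AArch64 backend code: the function names and the
byte counts are assumptions for illustration. Only the rule that PrologueSaveSize
is the sum of the callee-saved size and the fixed object size comes from the
description above.

```python
def prologue_save_size(callee_saved_stack_size: int, fixed_object: int) -> int:
    """Sum of the latter two parts: callee-saved area plus win64 vararg area."""
    return callee_saved_stack_size + fixed_object


def total_stack_size(local_stack_size: int,
                     callee_saved_stack_size: int,
                     fixed_object: int) -> int:
    """All three parts of the frame added together."""
    return local_stack_size + prologue_save_size(callee_saved_stack_size,
                                                 fixed_object)


# Illustrative byte counts only.
local, callee_saved, fixed = 32, 16, 64
print(prologue_save_size(callee_saved, fixed))       # 80
print(total_stack_size(local, callee_saved, fixed))  # 112
```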

In addition to moving the offsetting into emitPrologue/emitEpilogue
(which fixes functions with CombineSPBump == false), also set the
frame pointer to point to the right location, where the frame pointer
and link register actually are stored. In addition to the prologue/epilogue,
this also requires changes to resolveFrameIndexReference.

Add tests for a function that keeps a frame pointer and another one
that uses a VLA.

Differential Revision: https://reviews.llvm.org/D35919
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@309843 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-02 17:38:46 +00:00
Hans Wennborg
09c8b0603b Merging r309561:
------------------------------------------------------------------------
r309561 | sdardis | 2017-07-31 07:06:58 -0700 (Mon, 31 Jul 2017) | 14 lines

[SelectionDAG][mips] Fix PR33883

PR33883 shows that calls to intrinsic functions should not have their vector
arguments or returns subject to ABI changes required by the target.

This resolves PR33883.

Thanks to Alex Crichton for reporting the issue!

Reviewers: zoran.jovanovic, atanasyan

Differential Revision: https://reviews.llvm.org/D35765


------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@309767 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 23:44:38 +00:00
Hans Wennborg
498863ba87 Merging r309495:
------------------------------------------------------------------------
r309495 | fhahn | 2017-07-29 13:35:28 -0700 (Sat, 29 Jul 2017) | 30 lines

[AArch64] Tie source and destination operands for AESMC/AESIMC. 

Summary:
Most CPUs implementing AES fusion require instruction pairs of the form
    AESE Vn, _
    AESMC Vn, Vn
and
    AESD Vn, _
    AESIMC Vn, Vn

The constraint is added to AES(I)MC instructions which use the result of
an AES(E|D) instruction by using AES(I)MCTrr pseudo instructions, which
constrain source and destination registers to be the same.

A nice side effect of this change is that now all possible pairs are
scheduled back-to-back on the exynos-m1 for the misched-fusion-aes.ll
test case.

I had to update aes_load_store. The version I added initially was very
reduced and with the new constraint, AESE/AESMC could not be scheduled
back-to-back. I updated the test to be more realistic and still expose
the same scheduling problem as the initial test case.

Reviewers: t.p.northover, rengolin, evandro, kristof.beyls, silviu.baranga

Reviewed By: t.p.northover, evandro

Subscribers: aemerson, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D35299
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@309765 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-01 23:38:46 +00:00
Hans Wennborg
7b70f013ab Merging r309323:
------------------------------------------------------------------------
r309323 | ab | 2017-07-27 14:27:25 -0700 (Thu, 27 Jul 2017) | 12 lines

[AArch64] Fix legality info passed to demanded bits for TBI opt.

The (seldom-used) TBI-aware optimization had a typo lying dormant since
it was first introduced, in r252573:  when asking for demanded bits, it
told TLI that it was running after legalize, where the opposite was
true.

This is an important piece of information that the demanded bits
analysis uses to make assumptions about the node.  r301019 added such an
assumption, which was broken by the TBI combine.

Instead, pass the correct flags to TLO.
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@309586 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-31 17:17:43 +00:00
Hans Wennborg
e49f89330a Merging r309343:
------------------------------------------------------------------------
r309343 | rnk | 2017-07-27 17:58:35 -0700 (Thu, 27 Jul 2017) | 16 lines

[X86] Fix latent bug in sibcall eligibility logic

The X86 tail call eligibility logic was correct when it was written, but
the addition of inalloca and argument copy elision broke its
assumptions. It was assuming that fixed stack objects were immutable.

Currently, we aim to emit a tail call if no arguments have to be
re-arranged in memory. This code would trace the outgoing argument
values back to check if they are loads from an incoming stack object.
If the stack argument is immutable, then we won't need to store it back
to the stack when we tail call.

Fortunately, stack objects track their mutability, so we can just make
the obvious check to fix the bug.
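
A minimal Python sketch of the check described above, under stated assumptions:
the StackObject type, its fields and the offset comparison are hypothetical and
stand in for the real frame-info queries. The only part taken from the
description is that a mutable stack object must not be treated as already being
in place.

```python
from dataclasses import dataclass


@dataclass
class StackObject:
    offset: int      # hypothetical: position of the incoming argument slot
    immutable: bool  # whether the object is known never to be written


def arg_already_in_place(incoming: StackObject, outgoing_offset: int) -> bool:
    """True if the outgoing argument can be left where it is for a sibcall."""
    # The added check: a mutable object may have been overwritten, so the
    # value loaded from it may no longer match what the slot contains.
    if not incoming.immutable:
        return False
    # Assumption for this sketch: the slot must also be at the same offset.
    return incoming.offset == outgoing_offset


print(arg_already_in_place(StackObject(offset=8, immutable=True), 8))   # True
print(arg_already_in_place(StackObject(offset=8, immutable=False), 8))  # False
```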

This was http://crbug.com/749826
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@309577 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-31 16:45:02 +00:00
Hans Wennborg
56b1ef7698 Merging r309422:
------------------------------------------------------------------------
r309422 | rnk | 2017-07-28 12:48:40 -0700 (Fri, 28 Jul 2017) | 25 lines

Fix conditional tail call branch folding when both edges are the same

The conditional tail call logic did the wrong thing when both
destinations of a conditional branch were the same:

BB#1: derived from LLVM BB %entry
    Live Ins: %EFLAGS
    Predecessors according to CFG: BB#0
        JE_1 <BB#5>, %EFLAGS<imp-use,kill>
        JMP_1 <BB#5>

BB#5: derived from LLVM BB %sw.epilog
    Predecessors according to CFG: BB#1
        TCRETURNdi64 <ga:@mergeable_conditional_tailcall>, 0, ...

We would fold the JE_1 to a TCRETURNdi64cc, and then remove our BB#5
successor. Then BB#5 would be deleted as it had no predecessors, leaving
a dangling "JMP_1 <BB#5>" reference behind to cause assertions later.

This patch checks that both conditional branch destinations are
different before doing the transform. The standard branch folding logic
is able to remove both the JMP_1 and the JE_1, and for my test case we
end up forming a better conditional tail call later.
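
The guard described above can be modelled in a few lines of Python. This is a
sketch, not the branch folding code itself: CondBranch, its fields and the set of
tail-call blocks are hypothetical names. It only shows that the fold is refused
when both destinations of the conditional branch are the same block.

```python
from dataclasses import dataclass


@dataclass
class CondBranch:
    taken: str      # destination when the condition holds (the JE_1 target)
    not_taken: str  # destination otherwise (the JMP_1 target)


def can_fold_to_cond_tail_call(branch: CondBranch, tail_call_blocks: set) -> bool:
    """Decide whether the conditional jump may become a conditional tail call."""
    # Both edges lead to the same block: folding one edge would delete a
    # block that the remaining jump still refers to, so refuse.
    if branch.taken == branch.not_taken:
        return False
    return branch.taken in tail_call_blocks


tail_calls = {"BB#5"}
print(can_fold_to_cond_tail_call(CondBranch("BB#5", "BB#5"), tail_calls))  # False
print(can_fold_to_cond_tail_call(CondBranch("BB#5", "BB#2"), tail_calls))  # True
```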

Fixes PR33980
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@309574 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-31 16:41:22 +00:00
Hans Wennborg
43303b98f5 Merging r309302:
------------------------------------------------------------------------
r309302 | rksimon | 2017-07-27 11:15:54 -0700 (Thu, 27 Jul 2017) | 3 lines

[SelectionDAG] Improve DAGTypeLegalizer::convertMask assertion (PR33960)

Improve DAGTypeLegalizer::convertMask's isSETCCorConvertedSETCC assertion to properly check for any mixture of SETCC or BUILD_VECTOR of constants, or a logical mask op of them.
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@309348 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-28 01:32:26 +00:00
Hans Wennborg
de90d748dd Merging r308808, r308813 and r308906:
------------------------------------------------------------------------
r308808 | arsenm | 2017-07-21 16:56:13 -0700 (Fri, 21 Jul 2017) | 6 lines

RA: Remove assert on empty live intervals

This is possible if there is an undef use when
splitting the vreg during spilling.

Fixes bug 33620.
------------------------------------------------------------------------

------------------------------------------------------------------------
r308813 | arsenm | 2017-07-21 17:24:01 -0700 (Fri, 21 Jul 2017) | 6 lines

RA: Remove another assert on empty intervals

This case is similar to the one fixed in r308808,
except when rematerializing.

Fixes bug 33884.
------------------------------------------------------------------------

------------------------------------------------------------------------
r308906 | arsenm | 2017-07-24 11:07:55 -0700 (Mon, 24 Jul 2017) | 6 lines

RA: Replace asserts related to empty live intervals

These don't exactly assert the same thing anymore, and
allow empty live intervals with non-empty uses.

Removed in r308808 and r308813.
------------------------------------------------------------------------


git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@309171 91177308-0d34-0410-b5e6-96231b3b80d8
2017-07-26 20:34:36 +00:00