Commit Graph

1278 Commits

Author SHA1 Message Date
Cong Hou
5ae2b850eb Normalize MBB's successors' probabilities in several locations.
This patch adds some missing calls to MBB::normalizeSuccProbs() in several
locations where it should be called. Those places are found by checking if the
sum of successors' probabilities is approximate one in MachineBlockPlacement
pass with some instrumented code (not in this patch).


Differential revision: http://reviews.llvm.org/D15259




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255455 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-13 09:26:17 +00:00
Hal Finkel
15c5be1ee5 Revert r248483, r242546, r242545, and r242409 - absdiff intrinsics
After much discussion, ending here:

  http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151123/315620.html

it has been decided that, instead of having the vectorizer directly generate
special absdiff and horizontal-add intrinsics, we'll recognize the relevant
reduction patterns during CodeGen. Accordingly, these intrinsics are not needed
(the operations they represent can be pattern matched, as is already done in
some backends). Thus, we're backing these out in favor of the current
development work.

r248483 - Codegen: Fix llvm.*absdiff semantic.
r242546 - [ARM] Use [SU]ABSDIFF nodes instead of intrinsics for VABD/VABA
r242545 - [AArch64] Use [SU]ABSDIFF nodes instead of intrinsics for ABD/ABA
r242409 - [Codegen] Add intrinsics 'absdiff' and corresponding SDNodes for absolute difference operation

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255387 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-11 23:11:52 +00:00
Ahmed Bougacha
65422bf239 [AArch64][ARM] Don't base interleaved op legality on type alloc size.
Otherwise, we think that most types that look like they'd fit in a
legal vector type are legal (so, basically, *any* vector type with a
size between 33 and 128 bits, I think, since we use pow2 alignment;
e.g., v2i25, v3f32, ...).

DataLayout::getTypeAllocSize rounds up based on alignment.
When checking for target intrinsic legality, that's not what we want:
if rounding makes a difference, the type isn't legal, and the
target intrinsics shouldn't be used, as they are always assumed legal.

One could make the argument that alloc size is ultimately the most
relevant here, since we're dealing with LD/ST intrinsics. That's only
true if we did legalize them though; that's a problem for another day.

Use DataLayout::getTypeSizeInBits instead of getTypeAllocSizeInBits.
Type::getSizeInBits can't be used because that'd gratuitously break
pointer vector support.

Some of these uses are currently fine, because we only hit them when
the type is already known legal (e.g., r114454). Update them for
consistency. It's faster to avoid the rounding anyway!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255089 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-09 01:19:50 +00:00
Quentin Colombet
c445f0fb72 [ARM] When a bitcast is about to be turned into a VMOVDRR, try to combine it
with its source instead of forcing the values on GPRs.

This improves the lowering of vector code when such bitcasts happen in the
middle of vector computations.

rdar://problem/23691584 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254684 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-04 01:53:14 +00:00
Tim Northover
57b1a9599b AArch64: use ldxp/stxp pair to implement 128-bit atomic loads.
The ARM ARM is clear that 128-bit loads are only guaranteed to have been atomic
if there has been a corresponding successful stxp. It's less clear for AArch32, so
I'm leaving that alone for now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254524 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-02 18:12:57 +00:00
Cong Hou
5155021519 Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces.
(This is the second attempt to submit this patch. The first caused two assertion
 failures and was reverted. See https://llvm.org/bugs/show_bug.cgi?id=25687)

The patch in http://reviews.llvm.org/D13745 is broken into four parts:

1. New interfaces without functional changes (http://reviews.llvm.org/D13908).
2. Use new interfaces in SelectionDAG, while in other passes treat probabilities
as weights (http://reviews.llvm.org/D14361).
3. Use new interfaces in all other passes.
4. Remove old interfaces.

This patch is 3+4 above. In this patch, MBB won't provide weight-based
interfaces any more, which are totally replaced by probability-based ones.
The interface addSuccessor() is redesigned so that the default probability is
unknown. We allow unknown probabilities but don't allow using it together
with known probabilities in successor list. That is to say, we either have a
list of successors with all known probabilities, or all unknown
probabilities. In the latter case, we assume each successor has 1/N
probability where N is the number of successors. An assertion checks if the
user is attempting to add a successor with the disallowed mixed use as stated
above. This can help us catch many misuses.

All uses of weight-based interfaces are now updated to use probability-based
ones.


Differential revision: http://reviews.llvm.org/D14973




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254377 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-01 05:29:22 +00:00
Hans Wennborg
8e83fe2e97 Revert r254348: "Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces."
and the follow-up r254356: "Fix a bug in MachineBlockPlacement that may cause assertion failure during BranchProbability construction."

Asserts were firing in Chromium builds. See PR25687.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254366 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-01 03:49:42 +00:00
Cong Hou
92989cbe84 Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces.
The patch in http://reviews.llvm.org/D13745 is broken into four parts:

1. New interfaces without functional changes (http://reviews.llvm.org/D13908).
2. Use new interfaces in SelectionDAG, while in other passes treat probabilities
as weights (http://reviews.llvm.org/D14361).
3. Use new interfaces in all other passes.
4. Remove old interfaces.

This patch is 3+4 above. In this patch, MBB won't provide weight-based
interfaces any more, which are totally replaced by probability-based ones.
The interface addSuccessor() is redesigned so that the default probability is
unknown. We allow unknown probabilities but don't allow using it together
with known probabilities in successor list. That is to say, we either have a
list of successors with all known probabilities, or all unknown
probabilities. In the latter case, we assume each successor has 1/N
probability where N is the number of successors. An assertion checks if the
user is attempting to add a successor with the disallowed mixed use as stated
above. This can help us catch many misuses.

All uses of weight-based interfaces are now updated to use probability-based
ones.


Differential revision: http://reviews.llvm.org/D14973




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254348 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-01 00:02:51 +00:00
Martell Malone
937e2d588c ARM: address WOA unsigned division overflow crash
Building on r253865 the crash is not limited to signed overflows.

Disable custom handling of unsigned 32-bit and 64-bit integer divide.
Add test cases for both 32-bit and 64-bit unsigned integer overflow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254158 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-26 15:34:03 +00:00
Artyom Skrobov
824e14ddab Expose isXxxConstant() functions from SelectionDAGNodes.h (NFC)
Summary:
Many target lowerings copy-paste the code to test SDValues for known constants.
This code can instead be shared in SelectionDAG.cpp, and reused in the targets.

Reviewers: MatzeB, andreadb, tstellarAMD

Subscribers: arsenm, jyknight, llvm-commits

Differential Revision: http://reviews.llvm.org/D14945

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254085 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-25 19:41:11 +00:00
Martell Malone
ee54187984 ARM: address WoA division overflow crash
Disable custom handling of signed 32-bit and 64-bit integer divide.
Add test cases for both 32-bit and 64-bit integer overflow crashes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253865 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-23 13:11:39 +00:00
James Molloy
61639fb856 Properly check if a CMPZ node is in fact comparing against zero
This was left implicit and never ever checked, which means we could have a CMPZ against some non-zero value and we were carrying on with BFI conversion regardless.

Caught by Oliver Stannard using csmith; regression test added.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253195 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-16 10:49:25 +00:00
James Molloy
b5ab3ba365 [ARM] Replace ARMISD::RBIT with ISD::BITREVERSE
ISD::BITREVERSE matches "rbit" completely, so remove ARMISD::RBIT and mark ISD::BITREVERSE as legal, adding a test for lowering.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253047 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-13 16:05:22 +00:00
James Molloy
b5caa9fd56 [ARM] CMOV->BFI combining: handle both senses of CMPZ
I completely misunderstood what ARMISD::CMPZ means. It's not "compare equal to zero", it's "compare, only setting the zero/Z flag". It can either be equal-to-zero or not-equal-to-zero, and we weren't checking what sense it was.

If it's equal-to-zero, we can swap the operands around and pretend like it is not-equal-to-zero, which is both a bug fix and lets us handle more cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252891 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-12 13:49:17 +00:00
Diego Novillo
f2d61b3209 Properly fix unused variable in disable-assert builds.
I missed the side-effects of ParseBFI in my previous attempt (r252748).
Thanks dblaikie for the suggestion of adding a void use of the unused
variable instead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252751 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-11 16:39:22 +00:00
Diego Novillo
afbec8e339 Remove unused variable in disable-assert builds. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252748 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-11 16:14:52 +00:00
James Molloy
5e49f41b8d [ARM] Combine BFIs together
If we have a chain of BFIs, we may be able to combine several together into one merged BFI. We can do this if the "from" bits from one BFI OR'd with the "from" bits from the other BFI form a contiguous range, and the same with the "to" bits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252740 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-11 15:40:40 +00:00
Sanjay Patel
da103fa05e [ARM] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz()
ARM V6T2 has instructions for efficient count-leading/trailing-zeros, so this should be
considered a cheap operation (and therefore fair game for speculation) for any ARM V6T2
implementation.

The net result of allowing this speculation for the regression tests in this patch is
that we get this code:

ctlz:               
  clz  r0, r0
  bx  lr
cttz:              
  rbit  r0, r0
  clz  r0, r0
  bx  lr

Instead of:

ctlz:    
  cmp  r0, #0
  moveq  r0, #32
  clzne  r0, r0
  bx  lr
cttz:     
  cmp   r0, #0
  moveq  r0, #32
  rbitne  r0, r0
  clzne  r0, r0
  bx  lr

This will help solve a general speculation/despeculation problem noted in PR24818:
https://llvm.org/bugs/show_bug.cgi?id=24818

Differential Revision: http://reviews.llvm.org/D14469



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252639 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-10 19:24:31 +00:00
James Molloy
20538780ae Reapply "[ARM] Combine CMOV into BFI where possible"
Added fixes for stage2 failures: CMOV is not commutable; commuting the operands results in the condition being flipped! d'oh!

Original commit message:

If we have a CMOV, OR and AND combination such as:
  if (x & CN)
      y |= CM;

And:
  * CN is a single bit;
    * All bits covered by CM are known zero in y;

Then we can convert this to a sequence of BFI instructions. This will always be a win if CM is a single bit, will always be no worse than the TST & OR sequence if CM is two bits, and for thumb will be no worse if CM is three bits (due to the extra IT instruction).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252606 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-10 14:22:05 +00:00
Renato Golin
f60aec48e9 [EABI] Add LLVM support for -meabi flag
"GCC requires the freestanding environment provide memcpy, memmove, memset
and memcmp": https://gcc.gnu.org/onlinedocs/gcc-5.2.0/gcc/Standards.html

Hence in GNUEABI targets LLVM should not convert 'memops' to their equivalent
'__aeabi_memops'. This convertion violates GCC contract.

The -meabi flag controls whether or not LLVM will modify 'memops' in GNUEABI
targets.

Without -meabi: use the triple default EABI.
With -meabi=default: use the triple default EABI.
With -meabi=gnu: use 'memops'.
With -meabi=4 or -meabi=5: use '__aeabi_memops'.
With -meabi set to an unknown value: same as -meabi=default.

Patch by Vinicius Tinti.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252462 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-09 12:40:30 +00:00
Renato Golin
a7fb0ca802 Revert "[ARM] Combine CMOV into BFI where possible"
This reverts commit r252057, as it broke ARM self-hosting buildbots, probably
due to a code-gen fault.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252460 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-09 12:19:10 +00:00
Joseph Tremoulet
de9bf0f80e [WinEH] Update exception pointer registers
Summary:
The CLR's personality routine passes these in rdx/edx, not rax/eax.

Make getExceptionPointerRegister a virtual method parameterized by
personality function to allow making this distinction.

Similarly make getExceptionSelectorRegister a virtual method parameterized
by personality function, for symmetry.


Reviewers: pgavlin, majnemer, rnk

Subscribers: jyknight, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D14344

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252383 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-07 01:11:31 +00:00
James Molloy
ccdfbf603c [ARM] Compute known bits for ARMISD::CMOV
We can conservatively know that CMOV's known bits are the intersection of known bits for each of its operands. This helps PerformCMOVToBFICombine find more opportunities.

I tried hard to create a testcase for this and failed - we have to sufficiently confuse DAG.computeKnownBits which can see through all the cheap tricks I tried to narrow my larger testcase down :(

This code is actually exercised in CodeGen/ARM/bfi.ll, there's just no functional difference because DAG.computeKnownBits gets the right answer in that case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252168 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-05 15:21:58 +00:00
James Molloy
447c9ea9e1 [ARM] Combine CMOV into BFI where possible
If we have a CMOV, OR and AND combination such as:
  if (x & CN)
    y |= CM;

And:
  * CN is a single bit;
  * All bits covered by CM are known zero in y;

Then we can convert this to a sequence of BFI instructions. This will always be a win if CM is a single bit, will always be no worse than the TST & OR sequence if CM is two bits, and for thumb will be no worse if CM is three bits (due to the extra IT instruction).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252057 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-04 16:55:07 +00:00
Tim Northover
ed754ee4a7 ARM: add support for WatchOS's compact unwind information.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251573 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-28 22:56:36 +00:00
Tim Northover
7b7ff9e152 ARM: teach backend about WatchOS and TvOS libcalls.
The most substantial changes are again for watchOS: libcalls are hard-float if
needed and sincos has a different calling convention.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251571 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-28 22:51:16 +00:00
Charlie Turner
aca09a9d4b [ARM] Expand ROTL and ROTR of vector value types
Summary: After D13851 landed, we saw backend crashes when compiling the reduced test case included in this patch. The right fix seems to be to allow these vector types for expansion in instruction selection.

Reviewers: rengolin, t.p.northover

Subscribers: RKSimon, t.p.northover, aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D14082

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251401 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-27 10:25:20 +00:00
Peter Collingbourne
624894b4d2 ARM/ELF: Restore original (pre-r251322) logic for deciding whether to use GOT.
Unbreaks linking with gold, which cannot resolve direct relocations referring
to global symbols.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251342 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-26 20:46:44 +00:00
Peter Collingbourne
7da5357aca ARM/ELF: Better codegen for global variable addresses.
In PIC mode we were previously computing global variable addresses (or GOT
entry addresses) by adding the PC, the PC-relative GOT displacement and
the GOT-relative symbol/GOT entry displacement. Because the latter two
displacements are fixed, we ended up performing one more addition than
necessary.

This change causes us to compute addresses using a single PC-relative
displacement, resulting in a shorter code sequence. This reduces code size
by about 4% in a recent build of Chromium for Android.

As a result of this change we no longer need to compute the GOT base address
in the ARM backend, which allows us to remove the Global Base Reg pass and
SDAG lowering for the GOT.

We also now no longer use the GOT when addressing a symbol which is known
to be defined in the same linkage unit. Specifically, the symbol must have
either hidden visibility or a strong definition in the current module in
order to not use the the GOT.

This is a change from the previous behaviour where we would use the GOT to
address externally visible symbols defined in the same module. I think the
only cases where this could matter are cases involving symbol interposition,
but we don't really support that well anyway.

Differential Revision: http://reviews.llvm.org/D13650

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251322 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-26 18:23:16 +00:00
Craig Topper
ed05a5b554 Change makeLibCall to take an ArrayRef<SDValue> instead of pointer and size. This removes the need to pass a hardcoded size in many places. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251032 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-22 17:05:00 +00:00
Artyom Skrobov
190815e2ea Adding support for TargetLoweringBase::LibCall
Summary:
TargetLoweringBase::Expand is defined as "Try to expand this to other ops,
otherwise use a libcall." For ISD::UDIV and ISD::SDIV, the choice between
the two possibilities was defined in a rather convoluted way:

- if DIVREM is legal, expand to DIVREM
- if DIVREM has a custom lowering, expand to DIVREM
- if DIVREM libcall is defined and a remainder from the same division is
  computed elsewhere, expand to a DIVREM libcall
- else, expand to a DIV libcall

This had the undesirable effect that if both DIV and DIVREM are implemented
as libcalls, then ISD::UDIV and ISD::SDIV are expanded to the heavier DIVREM
libcall, even when the remainder isn't used.

The new code adds a new LegalizeAction, TargetLoweringBase::LibCall, so that
backends can directly control whether they prefer an expansion or a conversion
to a libcall. This makes the generic lowering code even more generic,
allowing its reuse in a wider range of target-specific configurations.

The useful effect is that ARM backend will now generate a call
to __aeabi_{i,u}div rather than __aeabi_{i,u}divmod in cases where
it doesn't need the remainder. There's no functional change outside
the ARM backend.

Reviewers: t.p.northover, rengolin

Subscribers: t.p.northover, llvm-commits, aemerson

Differential Revision: http://reviews.llvm.org/D13862

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250826 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-20 13:14:52 +00:00
Duncan P. N. Exon Smith
974314ae03 ARM: Remove implicit ilist iterator conversions, NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250759 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-19 23:25:57 +00:00
Craig Topper
44bf343ec1 Make a bunch of static arrays const.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250642 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-18 05:15:34 +00:00
Chad Rosier
cc78fcae6a [ARM] Promote helper function to SelectionDAG.
I'll be using the function in a similar combine for AArch64.  The helper was
also improved to handle undef values.

Part of http://reviews.llvm.org/D13442

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249572 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-07 17:28:58 +00:00
Oliver Stannard
21348cfda2 [ARM] Use correct half-precision functions in EABI mode
The ARM RTABI defines the half- to single-precision float conversion functions
with an __aeabi prefix, but libgcc only has them with a __gnu prefix. Therefore
we need to emit the __aeabi version when compiling with an eabi or eabihf
triple, and the __gnu version with a gnueabi or gnueabihf triple.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249565 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-07 16:58:49 +00:00
Chad Rosier
f33703d227 [ARM] Prevent PerformVDIVCombine from combining a vcvt/vdiv with 8 lanes.
This would result in a crash since the vcvt used does not support v8i32 types.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249560 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-07 16:15:40 +00:00
Jeroen Ketema
61dd176fb0 [ARM][AArch64] Only lower to interleaved load/store if the target has NEON
Without an additional check for NEON, the compiler crashes during
legalization of NEON ldN/stN.

Differential Revision: http://reviews.llvm.org/D13508


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249550 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-07 14:53:29 +00:00
Chad Rosier
1e68e2a33d [ARM] Push more complex check down to reduce compile time. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249547 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-07 13:40:44 +00:00
Chad Rosier
d851b2f286 [ARM] Minor refactoring. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249465 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-06 20:58:42 +00:00
Chad Rosier
86e10f92b5 [ARM] Minor refactoring. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249464 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-06 20:51:26 +00:00
Chad Rosier
ea139f1c91 [ARM] Minor refactoring. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249463 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-06 20:45:45 +00:00
Chad Rosier
399df14877 [ARM] Minor refactoring to improve readability. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249454 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-06 20:23:42 +00:00
Scott Douglass
ab6273be70 [ARM] Modify codegen for memcpy intrinsic to prefer LDM/STM.
We were previously codegen'ing memcpy as regular load/store operations and
hoping that the register allocator would allocate registers in ascending order
so that we could apply an LDM/STM combine after register allocation. According
to the commit that first introduced this code (r37179), we planned to teach the
register allocator to allocate the registers in ascending order. This never got
implemented, and up to now we've been stuck with very poor codegen.

A much simpler approach for achieving better codegen is to create MEMCPY pseudo
instructions, attach scratch virtual registers to them and then, post register
allocation, expand the MEMCPYs into LDM/STM pairs using the scratch registers.
The register allocator will have picked arbitrary registers which we sort when
expanding the MEMCPY. This approach also avoids the need to repeatedly calculate
offsets which ultimately ought to be eliminated pre-RA in order to decrease
register pressure.

Fixes PR9199 and PR23768.

[This is based on Peter Collingbourne's r238473 which was reverted.]

Differential Revision: http://reviews.llvm.org/D13239

Change-Id: I727543c2e94136e0f80b8e22d5642d7b9ee5b458
Author: Peter Collingbourne <peter@pcc.me.uk>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249322 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-05 14:49:54 +00:00
Jeroen Ketema
874371d11b [ARM][NEON] Use address space in vld([1234]|[234]lane) and vst([1234]|[234]lane) instructions
This commit changes the interface of the vld[1234], vld[234]lane, and vst[1234],
vst[234]lane ARM neon intrinsics and associates an address space with the
pointer that these intrinsics take. This changes, e.g.,

<2 x i32> @llvm.arm.neon.vld1.v2i32(i8*, i32)

to

<2 x i32> @llvm.arm.neon.vld1.v2i32.p0i8(i8*, i32)

This change ensures that address spaces are fully taken into account in the ARM
target during lowering of interleaved loads and stores.

Differential Revision: http://reviews.llvm.org/D12985



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248887 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-30 10:56:37 +00:00
Artyom Skrobov
c0ae278730 [ARM] Avoid redundant checks for isThumb1Only() after supportsTailCall()
supportsTailCall() has two callers. Both of them double-check isThumb1Only(),
and refuse to proceed with tail-calling in that case.
Therefore, it makes sense to move this check to
ARMSubtarget::initSubtargetFeatures, where SupportsTailCall is initialized;
and to eliminate the extra checks at the call sites.

Following a review comment, added an "assert(supportsTailCall())"
in IsEligibleForTailCall.

NFC.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248703 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-28 09:44:11 +00:00
Ahmed Bougacha
8e43417d2f [ARM] Don't generate clrex for pre-v7 targets.
Since r248294, we emit clrex, but it doesn't exist on v6.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248640 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-26 00:14:02 +00:00
Saleem Abdulrasool
054da58c53 ARM: make -Asserts,-Werror=unused-variable build happy
The value was only used in an assertion.  Sink the variable usage into the
assertion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248562 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 05:41:02 +00:00
Saleem Abdulrasool
64ed61ca6b ARM: address WoA division limitation
We now emit the compiler generated divide by zero check that was needed for the
MSVC routines.  We construct a psuedo-instruction for the DBZ check as the
operation requires splitting up the BB.  For the 64-bit operations, we need to
custom expand the node as we need to insert the DBZ check and then emit the
libcall to the appropriate name.  Because this is target specific, it seemed
better to reproduce the expansion operation from the target-agnostic type
legalization rather than sink this there to avoid the duplication.  The division
library calls now match MSVC semantically.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248561 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 05:15:46 +00:00
Artyom Skrobov
c848236c93 [ARM] Handle +t2dsp feature as an ArchExtKind in ARMTargetParser.def
Currently, the availability of DSP instructions (ACLE 6.4.7) is handled in a
hand-rolled tricky condition block in tools/clang/lib/Basic/Targets.cpp, with
a FIXME: attached.

This patch changes the handling of +t2dsp to be in line with other
architecture extensions.

Following a revert of r248152 and new review comments, this patch also includes
renaming FeatureDSPThumb2 -> FeatureDSP, hasThumb2DSP() -> hasDSP(), etc.
The spelling of "t2dsp" is preserved, pending a further investigation of its
possible external usage.

Differential Revision: http://reviews.llvm.org/D12937



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248519 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-24 17:31:16 +00:00
Ahmed Bougacha
16a661561f [ARM] Emit clrex in the expanded cmpxchg fail block.
ARM counterpart to r248291:

In the comparison failure block of a cmpxchg expansion, the initial
ldrex/ldxr will not be followed by a matching strex/stxr.
On ARM/AArch64, this unnecessarily ties up the execution monitor,
which might have a negative performance impact on some uarchs.

Instead, release the monitor in the failure block.
The clrex instruction was designed for this: use it.

Also see ARMARM v8-A B2.10.2:
"Exclusive access instructions and Shareable memory locations".

Differential Revision: http://reviews.llvm.org/D13033

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248294 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-22 17:22:58 +00:00