1591 Commits

Author SHA1 Message Date
Christof Douma
5c21be0e40 [ARM] Make RWPI use movw/movt when available
When constructing global address literals while targeting the RWPI
relocation model. LLVM currently only uses literal pools. If MOVW/MOVT
instructions are available we can use these instead. Beside being more
efficient it allows -arm-execute-only to work with
-relocation-model=RWPI as well.

When we generate MOVW/MOVT for global addresses when targeting the RWPI
relocation model, we need to use base relative relocations. This patch
does the needed plumbing in MC to generate these for MOVW/MOVT.

Differential Revision: https://reviews.llvm.org/D29487

Change-Id: I446786e43a6f5aa9b6a5bb2cd216d60d41c7755d

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294298 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-07 13:07:12 +00:00
Reid Kleckner
b084c50312 [CodeGen] Remove dead call-or-prologue enum from CCState
This enum has been dead since Olivier Stannard re-implemented ARM byval
handling in r202985 (2014).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293943 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-02 21:58:22 +00:00
Matthew Simpson
aa96bc9c68 [LV] Move interleaved access helper functions to VectorUtils (NFC)
This patch moves some helper functions related to interleaved access
vectorization out of LoopVectorize.cpp and into VectorUtils.cpp. We would like
to use these functions in a follow-on patch that improves interleaved load and
store lowering in (ARM/AArch64)ISelLowering.cpp. One of the functions was
already duplicated there and has been removed.

Differential Revision: https://reviews.llvm.org/D29398

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293788 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-01 17:45:46 +00:00
Matthias Braun
88d207542b Cleanup dump() functions.
We had various variants of defining dump() functions in LLVM. Normalize
them (this should just consistently implement the things discussed in
http://lists.llvm.org/pipermail/cfe-dev/2014-January/034323.html

For reference:
- Public headers should just declare the dump() method but not use
  LLVM_DUMP_METHOD or #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
- The definition of a dump method should look like this:
  #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
  LLVM_DUMP_METHOD void MyClass::dump() {
    // print stuff to dbgs()...
  }
  #endif

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293359 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-28 02:02:38 +00:00
Eugene Zelenko
88afd089cb [ARM] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293348 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-27 23:58:02 +00:00
Saleem Abdulrasool
611cb2c218 ARM: fix vectorized division on WoA
The Windows on ARM target uses custom division for normal division as
the backend needs to insert division-by-zero checks.  However, it is
designed to only handle non-vectorized division.  ARM has custom
lowering for vectorized division as that can avoid loading registers
with the values and invoke a division routine for each one, preferring
to lower using NEON instructions.  Fall back to the custom lowering for
the NEON instructions if we encounter a vectorized division.

Resolves PR31778!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293259 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-27 03:41:53 +00:00
Diana Picus
0f6ded417f [ARM] Use helpers for adding pred / CC operands. NFC
Hunt down some of the places where we use bare addReg(0) or addImm(AL).addReg(0)
and replace with add(condCodeOp()) and add(predOps()). This should make it
easier to understand what those operands represent (without having to look at
the definition of the instruction that we're adding to).

Differential Revision: https://reviews.llvm.org/D27984

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292587 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-20 08:15:24 +00:00
Saleem Abdulrasool
59f9c4e004 ARM: match GCC's behaviour for builtins
GCC changes the CC between the user-code and the builtins based on the
value of `-target` rather than `-mfloat-abi`.  When a HF target is used,
the VFP variant of the AAPCS CC is used.  Otherwise, the AAPCS variant
is used.  In all cases, the AEABI functions use the AAPCS CC.  Adjust
the calling convention based on the target.

Resolves PR30543!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291909 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-13 16:25:33 +00:00
Benjamin Kramer
1fb85c6675 Apply clang-tidy's performance-unnecessary-value-param to LLVM.
With some minor manual fixes for using function_ref instead of
std::function. No functional change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291904 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-13 14:39:03 +00:00
Diana Picus
9ff5fb18b6 [ARM] CodeGen: Replace AddDefaultT1CC and AddNoT1CC. NFC
For AddDefaultT1CC, we add a new helper t1CondCodeOp, which creates the
appropriate register operand. For AddNoT1CC, we use the existing condCodeOp
helper - we only had two uses of AddNoT1CC, so at this point it's probably not
worth having yet another helper just for them.

Differential Revision: https://reviews.llvm.org/D28603

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291894 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-13 10:37:37 +00:00
Diana Picus
1279248fc3 [ARM] CodeGen: Remove AddDefaultCC. NFC.
Replace all uses of AddDefaultCC with add(condCodeOp()).
The transformation has been done automatically with a custom tool based on Clang
AST Matchers + RefactoringTool.

Differential Revision: https://reviews.llvm.org/D28557

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291893 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-13 10:18:01 +00:00
Diana Picus
8a47810cd6 [CodeGen] Rename MachineInstrBuilder::addOperand. NFC
Rename from addOperand to just add, to match the other method that has been
added to MachineInstrBuilder for adding more than just 1 operand.

See https://reviews.llvm.org/D28057 for the whole discussion.

Differential Revision: https://reviews.llvm.org/D28556

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291891 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-13 09:58:52 +00:00
Diana Picus
0aacbaa851 [ARM] CodeGen: Remove AddDefaultPred. NFC.
Replace all uses of AddDefaultPred with MachineInstrBuilder::add(predOps()).
This makes the code building MachineInstrs more readable, because it allows us
to write code like:

MIB.addSomeOperand(blah)
   .add(predOps())
   .addAnotherOperand(blahblah)

instead of

AddDefaultPred(MIB.addSomeOperand(blah))
    .addAnotherOperand(blahblah)

This commit also adds the predOps helper in the ARM backend, as well as the add
method taking a variable number of operands to the MachineInstrBuilder.

The transformation has been done mostly automatically with a custom tool based
on Clang AST Matchers + RefactoringTool.

Differential Revision: https://reviews.llvm.org/D28555

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291890 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-13 09:37:56 +00:00
Saleem Abdulrasool
13f3144204 ARM: slightly more table driven libcall setup
Switch some additional library call setup to be table driven.  This
makes it more immediately obvious what the library call looks like.
This is important for ARM since the calling conventions for the builtins
change based on the target/libcall name.  NFC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291789 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-12 18:46:11 +00:00
Eli Friedman
1f1e5c29c3 [ARM] More aggressive matching for vpadd and vpaddl.
The new matchers work after legalization to make them simpler, and to avoid
blocking other optimizations.

Differential Revision: https://reviews.llvm.org/D27779



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291693 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-11 19:33:38 +00:00
Chad Rosier
ef64b17169 [ARM] Remove rbit intrinsics and autoupgrade to generic bitreverse.
Testing already covered by CodeGen/ARM/rbit.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291587 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-10 19:23:51 +00:00
Aaron Ballman
65985d046a Caught a simple typo. I do not know of a way to test this, but it seems like an unlikely thing to regress in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290757 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-30 15:57:56 +00:00
Eli Friedman
1e77c707b7 [ARM] Implement isExtractSubvectorCheap.
See https://reviews.llvm.org/D6678 for the history of
isExtractSubvectorCheap. Essentially the same considerations apply
to ARM.

This temporarily breaks the formation of vpadd/vpaddl in certain cases;
AddCombineToVPADDL essentially assumes that we won't form VUZP shuffles.
See https://reviews.llvm.org/D27779 for followup fix.

Differential Revision: https://reviews.llvm.org/D27774



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290198 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-20 20:05:07 +00:00
Eli Friedman
59768451f2 [ARM] Add ARMISD::VLD1DUP to match vld1_dup more consistently.
Currently, there are substantial problems forming vld1_dup even if the
VDUP survives legalization. The lack of an actual node
leads to terrible results: not only can we not form post-increment vld1_dup
instructions, but we form scalar pre-increment and post-increment
loads which force the loaded value into a GPR. This patch fixes that
by combining the vdup+load into an ARMISD node before DAGCombine
messes it up.

Also includes a crash fix for vld2_dup (see testcase @vld2dupi8_postinc_variable).

Recommiting with fix to avoid forming vld1dup if the type of the load
doesn't match the type of the vdup (see
https://llvm.org/bugs/show_bug.cgi?id=31404).

Differential Revision: https://reviews.llvm.org/D27694



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289972 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-16 18:44:08 +00:00
Diana Picus
c82e7ec9e2 [ARM] Expose methods to get the CCAssignFn. NFCI
Add two public methods to ARMTargetLowering: CCAssignFnForCall and
CCAssignFnForReturn, which are just calling the already existing private method
CCAssignFnForNode. These will come in handy for GlobalISel on ARM.

We also replace all calls to CCAssignFnForNode in ARMISelLowering.cpp, because
the new methods are friendlier to the reader.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289932 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-16 10:35:20 +00:00
Nico Weber
56bffdd266 Revert 279703, it caused PR31404.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289923 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-16 04:51:25 +00:00
Prakhar Bahuguna
af672fb209 [ARM] Implement execute-only support in CodeGen
This implements execute-only support for ARM code generation, which
prevents the compiler from generating data accesses to code sections.
The following changes are involved:

* Add the CodeGen option "-arm-execute-only" to the ARM code generator.
* Add the clang flag "-mexecute-only" as well as the GCC-compatible
  alias "-mpure-code" to enable this option.
* When enabled, literal pools are replaced with MOVW/MOVT instructions,
  with VMOV used in addition for floating-point literals. As the MOVT
  instruction is required, execute-only support is only available in
  Thumb mode for targets supporting ARMv8-M baseline or Thumb2.
* Jump tables are placed in data sections when in execute-only mode.
* The execute-only text section is assigned section ID 0, and is
  marked as unreadable with the SHF_ARM_PURECODE flag with symbol 'y'.
  This also overrides selection of ELF sections for globals.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289784 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-15 07:59:08 +00:00
Eli Friedman
85e1264a86 [ARM] Split 128-bit vectors in BUILD_VECTOR lowering
Given that INSERT_VECTOR_ELT operates on D registers anyway, combining
64-bit vectors into a 128-bit vector is basically free. Therefore, try
to split BUILD_VECTOR nodes before giving up and lowering them to a series
of INSERT_VECTOR_ELT instructions. Sometimes this allows dramatically
better lowerings; see testcases for examples. Inspired by similar code
in the x86 backend for AVX.

Differential Revision: https://reviews.llvm.org/D27624



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289706 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-14 20:44:38 +00:00
Eli Friedman
c023f9db3f [ARM] Add ARMISD::VLD1DUP to match vld1_dup more consistently.
Currently, there are substantial problems forming vld1_dup even if the
VDUP survives legalization. The lack of an actual node
leads to terrible results: not only can we not form post-increment vld1_dup
instructions, but we form scalar pre-increment and post-increment
loads which force the loaded value into a GPR. This patch fixes that
by combining the vdup+load into an ARMISD node before DAGCombine
messes it up.

Also includes a crash fix for vld2_dup (see testcase @vld2dupi8_postinc_variable).

Differential Revision: https://reviews.llvm.org/D27694



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289703 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-14 20:25:26 +00:00
Alina Sbirlea
068dd02393 Generalize strided store pattern in interleave access pass
Summary:
This patch aims to generalize matching of the strided store accesses to more general masks.
The more general rule is to have consecutive accesses based on the stride:
[x, y, ... z, x+1, y+1, ...z+1, x+2, y+2, ...z+2, ...]
All elements in the masks need not form a contiguous space, there may be gaps.
As before, undefs are allowed and filled in with adjacent element loads.

Reviewers: HaoLiu, mssimpso

Subscribers: mkuper, delena, llvm-commits

Differential Revision: https://reviews.llvm.org/D23646

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289573 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-13 19:32:36 +00:00
Matthias Braun
347847bcdc Move most EH from MachineModuleInfo to MachineFunction
Recommitting r288293 with some extra fixes for GlobalISel code.

Most of the exception handling members in MachineModuleInfo is actually
per function data (talks about the "current function") so it is better
to keep it at the function instead of the module.

This is a necessary step to have machine module passes work properly.

Also:
- Rename TidyLandingPads() to tidyLandingPads()
- Use doxygen member groups instead of "//===- EH ---"... so it is clear
  where a group ends.
- I had to add an ugly const_cast at two places in the AsmPrinter
  because the available MachineFunction pointers are const, but the code
  wants to call tidyLandingPads() in between
  (markFunctionEnd()/endFunction()).

Differential Revision: https://reviews.llvm.org/D27227

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288405 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-01 19:32:15 +00:00
Eric Christopher
e7b3959e01 Temporarily Revert "Move most EH from MachineModuleInfo to MachineFunction"
This apprears to have broken the global isel bot:
http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-globalisel_build/5174/console

This reverts commit r288293.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288322 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-01 07:50:12 +00:00
Matthias Braun
29c7b3a03e Move most EH from MachineModuleInfo to MachineFunction
Most of the exception handling members in MachineModuleInfo is actually
per function data (talks about the "current function") so it is better
to keep it at the function instead of the module.

This is a necessary step to have machine module passes work properly.

Also:
- Rename TidyLandingPads() to tidyLandingPads()
- Use doxygen member groups instead of "//===- EH ---"... so it is clear
  where a group ends.
- I had to add an ugly const_cast at two places in the AsmPrinter
  because the available MachineFunction pointers are const, but the code
  wants to call tidyLandingPads() in between
  (markFunctionEnd()/endFunction()).

Differential Revision: https://reviews.llvm.org/D27227

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288293 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-30 23:49:01 +00:00
Pablo Barrio
189d9909bf [ARM] Relax restriction on variadic functions for tailcall optimization
Summary:
Variadic functions can be treated in the same way as normal functions
with respect to the number and types of parameters.

Reviewers: grosbach, olista01, t.p.northover, rengolin

Subscribers: javed.absar, aemerson, llvm-commits

Differential Revision: https://reviews.llvm.org/D26748

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287219 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-17 10:56:58 +00:00
Tim Northover
9769d9a0aa ARM: fix CodeGen for 64-bit shifts.
One half of the shifts obviously needed conditional selection based on whether
the shift amount is more than 32-bits, but leaving the other half as the
natural shift isn't acceptable either: it's undefined behaviour to shift a
32-bit value by more than 31.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287149 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-16 20:54:28 +00:00
Saleem Abdulrasool
10d3cd3f52 ARM: lower fpowi appropriately for Windows ARM
This handles the last case of the builtin function calls that we would
generate code which differed from Microsoft's ABI.  Rather than
generating a call to `__pow{d,s}i2` we now promote the parameter to a
float or double and invoke `powf` or `pow` instead.

Addresses PR30825!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286082 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-06 19:46:54 +00:00
Weiming Zhao
298455f377 [Cortex-M0] Atomic lowering
Summary: ARMv6m supports dmb etc fench instructions but not ldrex/strex etc. So for some atomic load/store, LLVM should inline instructions instead of lowering to __sync_ calls.

Reviewers: rengolin, efriedma, t.p.northover, jmolloy

Subscribers: efriedma, aemerson, llvm-commits

Differential Revision: https://reviews.llvm.org/D26120

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285969 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-03 21:49:08 +00:00
Saleem Abdulrasool
f185e91e2b CodeGen: further loosen -O0 CG for WoA division
Generate the slowest possible codepath for noopt CodeGen.  Even trying to be
clever with the negated jump can cause out-of-range jumps.  Use a wide branch
instead. Although the code is modelled simplistically, the later optimizations
would recombine the branching into `cbz` if possible.  This re-enables the
previous optimization as well as hopefully gives us working code in all cases.

Addresses PR30356!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285649 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-31 22:12:37 +00:00
Saleem Abdulrasool
b5143b06a1 ARM: ensure that the Windows DBZ check is in range
The Windows ARM target expects the compiler to emit a division-by-zero check.
The check would use the form of:

    cmp r?, #0
    cbz .Ltrap
    b .Lbody
  .Lbody:
    ...
  .Ltrap:
    udf #249 @ __brkdiv0

This works great most of the time.  However, if the body of the function is
greater than 127 bytes, the branch target limitation of cbz becomes an issue.
This occurs in the unoptimized code generation cases sometimes (like in
compiler-rt).

Since this is a matter of correctness, possibly pay a small penalty instead.  We
now form this slightly differently:

    cbnz .Lbody
    udf #249 @ __brkdiv0
  .Lbody:
    ...

The positive case is through the branch instead of being the next instruction.
However, because of the basic block layout, the negated branch is going to be
a short distance always (2 bytes away, after the inserted __brkdiv0).

The new t__brkdiv0 instruction is required to explicitly mark the instruction as
a terminator as the generic UDF instruction is not a terminator.

Addresses PR30532!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285312 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-27 16:59:22 +00:00
Sam Parker
a6ec572d31 [ARM] Predicate UMAAL selection on hasDSP.
UMAAL is a DSP instruction and it is not available on thumbv7m
(Cortex-M3) and thumbv6m (Cortex-M0+1) targets. Also fix wrong
CHECK prefix in longMAC.ll test.

Patch by Vadzim Dambrouski.

Differential Revision: https://reviews.llvm.org/D25890



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285278 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-27 09:47:10 +00:00
Eli Friedman
ed57153864 Improve ARM lowering for "icmp <2 x i64> eq".
The custom lowering is pretty straightforward: basically, just AND
together the two halves of a <4 x i32> compare.

Differential Revision: https://reviews.llvm.org/D25713



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284536 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-18 21:03:40 +00:00
Javed Absar
172d6c078b [ARM]: Assign cost of scaling used in addressing mode for ARM cores
This patch assigns cost of the scaling used in addressing.
On many ARM cores, a negated register offset takes longer than a
non-negated register offset, in a register-offset addressing mode.

For instance:

LDR R0, [R1, R2 LSL #2]
LDR R0, [R1, -R2 LSL #2]

Above, (1) takes less cycles than (2).

By assigning appropriate scaling factor cost, we enable the LLVM
to make the right trade-offs in the optimization and code-selection phase.

Differential Revision: http://reviews.llvm.org/D24857

Reviewers: jmolloy, rengolin




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284127 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-13 14:57:43 +00:00
Oliver Stannard
f903c00341 [ARM] Fix registers clobbered by SjLj EH on soft-float targets
Currently, the Int_eh_sjlj_dispatchsetup intrinsic is marked as
clobbering all registers, including floating-point registers that may
not be present on the target. This is technically true, as we could get
linked against code that does use the FP registers, but that will not
actually work, as the soft-float code cannot save and restore the FP
registers. SjLj exception handling can only work correctly if either all
or none of the code is built for a target with FP registers. Therefore,
we can assume that, when Int_eh_sjlj_dispatchsetup is compiled for a
soft-float target, it is only going to be linked against other
soft-float code, and so only clobbers the general-purpose registers.
This allows us to check that no non-savable registers are clobbered when
generating the prologue/epilogue.

Differential Revision: https://reviews.llvm.org/D25180



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283866 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-11 10:06:59 +00:00
Martin Storsjo
c3339a54ac [ARM] Reapply: Use __rt_div functions for divrem on Windows
Reapplying r283383 after revert in r283442. The additional fix
is a getting rid of a stray space in a function name, in the
refactoring part of the commit.

This avoids falling back to calling out to the GCC rem functions
(__moddi3, __umoddi3) when targeting Windows.

The __rt_div functions have flipped the two arguments compared
to the __aeabi_divmod functions. To match MSVC, we emit a
check for division by zero before actually calling the library
function (even if the library function itself also might do
the same check).

Not all calls to __rt_div functions for division are currently
merged with calls to the same function with the same parameters
for the remainder. This is more wasteful than a div + mls as before,
but avoids calls to __moddi3.

Differential Revision: https://reviews.llvm.org/D25332

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283550 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-07 13:28:53 +00:00
Diana Picus
44f7288034 Revert "[ARM] Use __rt_div functions for divrem on Windows"
This reverts commit r283383 because it broke some of the bots:
undefined reference to ` __aeabi_uldivmod'

It affected (at least) clang-cmake-armv7-a15-selfhost,
clang-cmake-armv7-a15-selfhost and clang-native-arm-lnt.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283442 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-06 11:24:29 +00:00
James Molloy
3747be0f9d [ARM] Constant pool promotion - fix alignment calculation
Global variables are GlobalValues, so they have explicit alignment. Querying
DataLayout for the alignment was incorrect.

Testcase added.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283423 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-06 07:56:00 +00:00
Martin Storsjo
8c96de4693 [ARM] Use __rt_div functions for divrem on Windows
This avoids falling back to calling out to the GCC rem functions
(__moddi3, __umoddi3) when targeting Windows.

The __rt_div functions have flipped the two arguments compared
to the __aeabi_divmod functions. To match MSVC, we emit a
check for division by zero before actually calling the library
function (even if the library function itself also might do
the same check).

Not all calls to __rt_div functions for division are currently
merged with calls to the same function with the same parameters
for the remainder. This is more wasteful than a div + mls as before,
but avoids calls to __moddi3.

Differential Revision: https://reviews.llvm.org/D24076

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283383 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-05 21:08:02 +00:00
Sjoerd Meijer
ec44dc9079 [ARM] Code size optimisation to lower udiv+urem to udiv+mls instead of a
library call to __aeabi_uidivmod. This is an improved implementation of
r280808, see also D24133, that got reverted because isel was stuck in a loop.
That was caused by the optimisation incorrectly triggering on i64 ints, which
shouldn't happen because there is no 64bit hwdiv support; that put isel's type
legalization and this optimisation in a loop. A native ARM compiler and testing
now shows that this is fixed.

Patch mostly by Pablo Barrio.

Differential Revision: https://reviews.llvm.org/D25077


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283098 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-03 10:12:32 +00:00
James Molloy
ba54dc2e88 [ARM] Promote small global constants to constant pools
If a constant is unamed_addr and is only used within one function, we can save
on the code size and runtime cost of an indirection by changing the global's storage
to inside the constant pool. For example, instead of:

      ldr r0, .CPI0
      bl printf
      bx lr
    .CPI0: &format_string
    format_string: .asciz "hello, world!\n"

We can emit:

      adr r0, .CPI0
      bl printf
      bx lr
    .CPI0: .asciz "hello, world!\n"

This can cause significant code size savings when many small strings are used in one
function (4 bytes per string).

This recommit contains fixes for a nasty bug related to fast-isel fallback - because
fast-isel doesn't know about this optimization, if it runs and emits references to
a string that we inline (because fast-isel fell back to SDAG) we will end up
with an inlined string and also an out-of-line string, and we won't emit the
out-of-line string, causing backend failures.

It also contains fixes for emitting .text relocations which made the sanitizer
bots unhappy.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282387 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-26 07:26:24 +00:00
James Molloy
3b62127542 Revert "[ARM] Promote small global constants to constant pools"
This reverts commit r282241. It caused http://lab.llvm.org:8011/builders/clang-native-arm-lnt/builds/19882.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282249 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-23 13:35:43 +00:00
James Molloy
e2c1cbe138 [ARM] Promote small global constants to constant pools
If a constant is unamed_addr and is only used within one function, we can save
on the code size and runtime cost of an indirection by changing the global's storage
to inside the constant pool. For example, instead of:

      ldr r0, .CPI0
      bl printf
      bx lr
    .CPI0: &format_string
    format_string: .asciz "hello, world!\n"

We can emit:

      adr r0, .CPI0
      bl printf
      bx lr
    .CPI0: .asciz "hello, world!\n"

This can cause significant code size savings when many small strings are used in one
function (4 bytes per string).

This recommit contains fixes for a nasty bug related to fast-isel fallback - because
fast-isel doesn't know about this optimization, if it runs and emits references to
a string that we inline (because fast-isel fell back to SDAG) we will end up
with an inlined string and also an out-of-line string, and we won't emit the
out-of-line string, causing backend failures.

It also contains fixes for emitting .text relocations which made the sanitizer
bots unhappy.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282241 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-23 12:15:58 +00:00
Nico Weber
f83ecb0d7e Revert r281715, it caused PR30475
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282076 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-21 15:33:24 +00:00
Sjoerd Meijer
ea7e0cd04b Reverting r281719, this is causing buildbot failures and timeouts again.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281722 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-16 13:16:52 +00:00
Sjoerd Meijer
f9c0a77050 This is an attempt to reapply r280808: [ARM] Lower UDIV+UREM to UDIV+MLS
(and the same for SREM)

This was causing buildbot failures earlier (time outs in the LNT suite).
However, we haven't been able to reproduce this and are suspecting this
was caused by another (reverted) patch.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281719 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-16 12:10:09 +00:00
James Molloy
d95bddac05 [ARM] Promote small global constants to constant pools
If a constant is unamed_addr and is only used within one function, we can save
on the code size and runtime cost of an indirection by changing the global's storage
to inside the constant pool. For example, instead of:

      ldr r0, .CPI0
      bl printf
      bx lr
    .CPI0: &format_string
    format_string: .asciz "hello, world!\n"

We can emit:

      adr r0, .CPI0
      bl printf
      bx lr
    .CPI0: .asciz "hello, world!\n"

This can cause significant code size savings when many small strings are used in one
function (4 bytes per string).

This recommit contains fixes for a nasty bug related to fast-isel fallback - because
fast-isel doesn't know about this optimization, if it runs and emits references to
a string that we inline (because fast-isel fell back to SDAG) we will end up
with an inlined string and also an out-of-line string, and we won't emit the
out-of-line string, causing backend failures.

It also contains fixes for emitting .text relocations which made the sanitizer
bots unhappy.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281715 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-16 10:17:04 +00:00