Commit Graph

2456 Commits

Author SHA1 Message Date
Akira Hatanaka
516286ff69 Use function attribute "trap-func-name" and remove TargetOptions::TrapFuncName.
This commit changes normal isel and fast isel to read the user-defined trap
function name from function attribute "trap-func-name" attached to llvm.trap or
llvm.debugtrap instead of from TargetOptions::TrapFuncName. This is needed to
use clang's command line option "-ftrap-function" for LTO and enable changing
the trap function name on a per-call-site basis.

Out-of-tree projects currently using TargetOptions::TrapFuncName to specify the
trap function name should attach attribute "trap-func-name" to the call sites
of llvm.trap and llvm.debugtrap instead.

rdar://problem/21225723

Differential Revision: http://reviews.llvm.org/D10832


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241305 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-02 22:13:27 +00:00
Tim Northover
9cbdfb5c05 ARM: add correct kill flags when combining stm instructions
When the store sequence being combined actually stores the base register, we
should not mark it as killed until the end.

rdar://21504262

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241003 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-29 21:42:16 +00:00
Javed Absar
263dd533ee [ARM]: Extend -mfpu options for half-precision and vfpv3xd
Some of the the permissible ARM -mfpu options, which are supported in GCC,
are currently not present in llvm/clang.This patch adds the options:
'neon-fp16', 'vfpv3-fp16', 'vfpv3-d16-fp16', 'vfpv3xd' and 'vfpv3xd-fp16.
These are related to half-precision floating-point and single precision.

Reviewers: rengolin, ranjeet.singh

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10645



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240930 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-29 09:32:29 +00:00
Javed Absar
d105e18ab6 [ARM] Cortex-R5 is not VFPOnlySP
This patch fixes the error in ARM.td which stated that Cortex-R5
floating point unit can do only single precision, when it can do double as well.

Reviewers: rengolin

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10769



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240799 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-26 17:42:37 +00:00
Javed Absar
5511d97506 [ARM] Cortex-R4F is not VFPOnlySP
Cortex-R4F TRM states that fpu supports both single and double precision.
This patch corrects the information in ARM.td file and corresponding test.

Reviewers: rengolin

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10763



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240776 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-26 12:14:56 +00:00
Hao Liu
5be64c403b [ARM] Lower interleaved memory accesses to vldN/vstN intrinsics.
This patch also adds a function to calculate the cost of interleaved memory accesses.

E.g. Lower an interleaved load:
        %wide.vec = load <8 x i32>, <8 x i32>* %ptr, align 4
        %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6>
        %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7>
     into:
        %vld2 = { <4 x i32>, <4 x i32> } call llvm.arm.neon.vld2(%ptr, 4)
        %vec0 = extractelement { <4 x i32>, <4 x i32> } %vld2, i32 0
        %vec1 = extractelement { <4 x i32>, <4 x i32> } %vld2, i32 1

E.g. Lower an interleaved store:
        %i.vec = shuffle <8 x i32> %v0, <8 x i32> %v1, <0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7, 11>
        store <12 x i32> %i.vec, <12 x i32>* %ptr, align 4
     into:
        %sub.v0 = shuffle <8 x i32> %v0, <8 x i32> v1, <0, 1, 2, 3>
        %sub.v1 = shuffle <8 x i32> %v0, <8 x i32> v1, <4, 5, 6, 7>
        %sub.v2 = shuffle <8 x i32> %v0, <8 x i32> v1, <8, 9, 10, 11>
        call void llvm.arm.neon.vst3(%ptr, %sub.v0, %sub.v1, %sub.v2, 4)

Differential Revision: http://reviews.llvm.org/D10533


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240755 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-26 02:45:36 +00:00
Matthias Braun
5056381b75 Fix mismatched architectures in test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240745 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-26 00:26:46 +00:00
Matthias Braun
7d46df3626 ARMLoadStoreOptimizer: Fix errata 602117 handling and make testcase actually test for it
This fixes PR23912

Differential Revision: http://reviews.llvm.org/D10620

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240582 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-24 20:03:27 +00:00
Ahmed Bougacha
ac655060b4 [ARM] Look through concat when lowering in-place shuffles (VZIP, ..)
Currently, we canonicalize shuffles that produce a result larger than
their operands with:
  shuffle(concat(v1, undef), concat(v2, undef))
->
  shuffle(concat(v1, v2), undef)

because we can access quad vectors (see PerformVECTOR_SHUFFLECombine).

This is useful in the general case, but there are special cases where
native shuffles produce larger results: the two-result ops.

We can look through the concat when lowering them:
  shuffle(concat(v1, v2), undef)
->
  concat(VZIP(v1, v2):0, :1)

This lets us generate the native shuffles instead of scalarizing to
dozens of VMOVs.

Differential Revision: http://reviews.llvm.org/D10424


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240118 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-19 02:32:35 +00:00
Ahmed Bougacha
2120f5332b [ARM] Add D-sized vtrn/vuzp/vzip tests, and cleanup. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240114 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-19 02:15:34 +00:00
Eric Christopher
933d2bd391 Fix "the the" in comments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@240112 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-19 01:53:21 +00:00
David Majnemer
cc714e2142 Move the personality function from LandingPadInst to Function
The personality routine currently lives in the LandingPadInst.

This isn't desirable because:
- All LandingPadInsts in the same function must have the same
  personality routine.  This means that each LandingPadInst beyond the
  first has an operand which produces no additional information.

- There is ongoing work to introduce EH IR constructs other than
  LandingPadInst.  Moving the personality routine off of any one
  particular Instruction and onto the parent function seems a lot better
  than have N different places a personality function can sneak onto an
  exceptional function.

Differential Revision: http://reviews.llvm.org/D10429

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239940 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-17 20:52:32 +00:00
John Brawn
14d0411acb [ARM] Disabling vfp4 should disable fp16
ARMTargetParser::getFPUFeatures should disable fp16 whenever it
disables vfp4, as otherwise something like -mcpu=cortex-a7 -mfpu=none
leaves us with fp16 enabled (though the only effect that will have is
a wrong build attribute).

Differential Revision: http://reviews.llvm.org/D10397


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239599 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-12 09:38:51 +00:00
NAKAMURA Takumi
b1337aba53 Add explicit -mtriple=arm-unknown to llvm/test/CodeGen/ARM/disable-tail-calls.ll, to satisfy *-win32.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239442 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-09 23:33:25 +00:00
Akira Hatanaka
0e3246a86f Remove DisableTailCalls from TargetOptions and the code in resetTargetOptions
that was resetting it.

Remove the uses of DisableTailCalls in subclasses of TargetLowering and use
the value of function attribute "disable-tail-calls" instead. Also,
unconditionally add pass TailCallElim to the pipeline and check the function
attribute at the start of runOnFunction to disable the pass on a per-function
basis. 
 
This is part of the work to remove TargetMachine::resetTargetOptions, and since
DisableTailCalls was the last non-fast-math option that was being reset in that
function, we should be able to remove the function entirely after the work to
propagate IR-level fast-math flags to DAG nodes is completed.

Out-of-tree users should remove the uses of DisableTailCalls and make changes
to attach attribute "disable-tail-calls"="true" or "false" to the functions in
the IR.

rdar://problem/13752163

Differential Revision: http://reviews.llvm.org/D10099


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239427 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-09 19:07:19 +00:00
Akira Hatanaka
fa6bc2e94d [ARM] Pass a callback to FunctionPass constructors to enable skipping execution
on a per-function basis.

Previously some of the passes were conditionally added to ARM's pass pipeline
based on the target machine's subtarget. This patch makes changes to add those
passes unconditionally and execute them conditonally based on the predicate
functor passed to the pass constructors. This enables running different sets of
passes for different functions in the module.

rdar://problem/20542263

Differential Revision: http://reviews.llvm.org/D8717


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239325 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-08 18:50:43 +00:00
John Brawn
272d7fdf42 [ARM] Add support for -sp- FPUs and FPU none to TargetParser
These are added mainly for the benefit of clang, but this also means that they
are now allowed in .fpu directives and we emit the correct .fpu directive when
single-precision-only is used.

Differential Revision: http://reviews.llvm.org/D10238


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239151 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-05 13:31:19 +00:00
Matthias Braun
e942914d29 ARM: Thumb2 LDRD/STRD supports independent input/output regs
The existing code would unnecessarily break LDRD/STRD apart with
non-adjacent registers, on thumb2 this is not necessary.

Ideally on thumb2 we shouldn't match for ldrd/strd pre-regalloc anymore
as there is not reason to set register hints anymore, changing that is
something for a future patch however.

Differential Revision: http://reviews.llvm.org/D9694

Recommiting after the revert in r238821, the buildbot still failed with
the patch removed so there seems to be another reason for the breakage.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238935 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-03 16:30:24 +00:00
Renato Golin
6b35bec8ef Revert "ARM: Thumb2 LDRD/STRD supports independent input/output regs"
This reverts commit r238795, as it broke the Thumb2 self-hosting buildbot.

Since self-hosting issues with Clang are hard to investigate, I'm taking the
liberty to revert now, so we can investigate it offline.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238821 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-02 11:47:30 +00:00
Matthias Braun
d421582e90 ARM: Thumb2 LDRD/STRD supports independent input/output regs
The existing code would unnecessarily break LDRD/STRD apart with
non-adjacent registers, on thumb2 this is not necessary.

Ideally on thumb2 we shouldn't match for ldrd/strd pre-regalloc anymore
as there is not reason to set register hints anymore, changing that is
something for a future patch however.

Differential Revision: http://reviews.llvm.org/D9694

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238795 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-01 23:27:08 +00:00
Luke Cheeseman
7d97fc4164 Re-commit of r238201 with fix for building with shared libraries.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238739 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-01 12:02:47 +00:00
Tim Northover
876dd978b8 ARM: recommit r237590: allow jump tables to be placed as constant islands.
The original version didn't properly account for the base register
being modified before the final jump, so caused miscompilations in
Chromium and LLVM. I've fixed this and tested with an LLVM self-host
(I don't have the means to build & test Chromium).

The general idea remains the same: in pathological cases jump tables
can be too far away from the instructions referencing them (like other
constants) so they need to be movable.

Should fix PR23627.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238680 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-31 19:22:07 +00:00
Diego Novillo
9c24c958f1 Revert "Re-commit changes in r237579 with fix for bug breaking windows builds."
This reverts commit r238201 to fix linking problems in x86 Linux
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150525/278413.html

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238223 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-26 17:45:38 +00:00
Luke Cheeseman
262e24f7af Re-commit changes in r237579 with fix for bug breaking windows builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238201 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-26 13:40:31 +00:00
Akira Hatanaka
01461204b3 Stop resetting NoFramePointerElim in TargetMachine::resetTargetOptions.
This is part of the work to remove TargetMachine::resetTargetOptions.

In this patch, instead of updating global variable NoFramePointerElim in
resetTargetOptions, its use in DisableFramePointerElim is replaced with a call
to TargetFrameLowering::noFramePointerElim. This function determines on a
per-function basis if frame pointer elimination should be disabled.

There is no change in functionality except that cl:opt option "disable-fp-elim"
can now override function attribute "no-frame-pointer-elim". 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238080 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-23 01:14:08 +00:00
Peter Collingbourne
66811d9817 Revert r237590, "ARM: allow jump tables to be placed as constant islands."
Caused a miscompile of the Android port of Chromium, details
forthcoming.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237972 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-21 23:20:55 +00:00
Davide Italiano
e677c7bd22 [Target/ARM] Only enable OptimizeBarrierPass at -O1 and above.
Ideally this is going to be and LLVM IR pass (shared, among others
with AArch64), but for the time being just enable it if consumers
ask us for optimization and not unconditionally.

Discussed with Tim Northover on IRC.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237837 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-20 21:40:38 +00:00
Tim Northover
9962fd0e2e ARM: allow jump tables to be placed as constant islands.
Previously, they were forced to immediately follow the actual branch
instruction. This was usually OK (the LEAs actually accessing them got emitted
nearby, and weren't usually separated much afterwards). Unfortunately, a
sufficiently nasty phi elimination dumps many instructions right before the
basic block terminator, and this can increase the range too much.

This patch frees them up to be placed as usual by the constant islands pass,
and consequently has to slightly modify the form of TBB/TBH tables to refer to
a PC-relative label at the final jump. The other jump table formats were
already position-independent.

rdar://20813304

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237590 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-18 17:10:40 +00:00
Oliver Stannard
0139af335f Revert r237579, as it broke windows buildbots
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237583 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-18 16:39:16 +00:00
Oliver Stannard
d811b4bacb [LLVM - ARM/AArch64] Add ACLE special register intrinsics
This patch implements LLVM support for the ACLE special register intrinsics in
section 10.1, __arm_{w,r}sr{,p,64}.

This patch is intended to lower the read/write_register instrinsics, used to
implement the special register intrinsics in the clang patch for special
register intrinsics (see http://reviews.llvm.org/D9697), to ARM specific
instructions MRC,MCR,MSR etc. to allow reading an writing of coprocessor
registers in AArch32 and AArch64. This is done by inspecting the register
string passed to the intrinsic and then lowering to the appropriate
instruction.

Patch by Luke Cheeseman.

Differential Revision: http://reviews.llvm.org/D9699



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237579 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-18 16:23:33 +00:00
Ahmed Bougacha
d8b3f0d785 [CodeGen] Use standard -not gnueabi- naming for f16 libcalls on Darwin.
Other targets probably should as well.  Since r237161, compiler-rt has
both, but I don't see why anything other than gnueabi would use a
gnueabi naming scheme.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237324 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-14 01:00:51 +00:00
Saleem Abdulrasool
82550fed5c CodeGen: ignore DEBUG_VALUE nodes in KILL tagging
DEBUG_VALUE nodes do not take part in code generation.  Ignore them when
performing KILL updates.  Addresses PR23486.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237211 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-12 23:36:18 +00:00
Sunil Srivastava
561c44fc33 Changed renaming of local symbols by inserting a dot vefore the numeric suffix.
One code change and several test changes to match that
details in http://reviews.llvm.org/D9481


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237150 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-12 16:47:30 +00:00
John Brawn
e38f45effc [ARM] Use AEABI aligned function variants
AEABI defines aligned variants of memcpy etc. that can be faster than
the default version due to not having to do alignment checks. When
emitting target code for these functions make use of these aligned
variants if possible. Also convert memset to memclr if possible.

Differential Revision: http://reviews.llvm.org/D8060


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237127 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-12 13:13:38 +00:00
Eric Christopher
0552d51c45 Migrate existing backends that care about software floating point
to use the information in the module rather than TargetOptions.

We've had and clang has used the use-soft-float attribute for some
time now so have the backends set a subtarget feature based on
a particular function now that subtargets are created based on
functions and function attributes.

For the one middle end soft float check go ahead and create
an overloadable TargetLowering::useSoftFloat function that
just checks the TargetSubtargetInfo in all cases.

Also remove the command line option that hard codes whether or
not soft-float is set by using the attribute for all of the
target specific test cases - for the generic just go ahead and
add the attribute in the one case that showed up.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237079 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-12 01:26:05 +00:00
Pete Cooper
f5b930b2e2 [Fast-ISel] Don't mark the first use of a remat constant as killed.
When emitting something like 'add x, 1000' if we remat the 1000 then we should be able to
mark the vreg containing 1000 as killed.  Given that we go bottom up in fast-isel, a later
use of 1000 will be higher up in the BB and won't kill it, or be impacted by the lower kill.

However, rematerialised constant expressions aren't generated bottom up.  The local value save area
grows downwards.  This means that if you remat 2 constant expressions which both use 1000 then the
first will kill it, then the second, which is *lower* in the BB will read a killed register.

This is the case in the attached test where the 2 GEPs both need to generate 'add x, 6680' for the constant offset.

Note that this commit only makes kill flag generation conservative.  There's nothing else obviously wrong with
the local value save area growing downwards, and in fact it needs to for handling arbitrarily complex constant expressions.

However, it would be nice if there was a solution which would let us generate more accurate kill flags, or just kill flags completely.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236922 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-09 00:51:03 +00:00
Arnold Schwaighofer
75e36e847e ScheduleDAGInstrs: In functions with tail calls PseudoSourceValues are not non-aliasing distinct objects
The code that builds the dependence graph assumes that two PseudoSourceValues
don't alias. In a tail calling function two FixedStackObjects might refer to the
same location. Worse 'immutable' fixed stack objects like function arguments are
not immutable and will be clobbered.

Change this so that a load from a FixedStackObject is not invariant in a tail
calling function and don't return a PseudoSourceValue for an instruction in tail
calling functions when building the dependence graph so that we handle function
arguments conservatively.

Fix for PR23459.

rdar://20740035

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236916 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-08 23:52:00 +00:00
Pete Cooper
f9f04c25b2 [Fast-ISel] Clear kill flags on registers replaced by updateValueMap.
When selecting an extract instruction, we don't actually generate code but instead work out which register we are reading, and rewrite uses of the extract def to the source register.  This is done via updateValueMap,.

However, its possible that the source register we are rewriting *to* to also have uses.  If those uses are after a kill of the value we are rewriting *from* then we have uses after a kill and the verifier fails.

This code checks for the case where the to register is also used, and if so it clears all kill on the from register.  This is conservative, but better that always clearing kills on the from register.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236897 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-08 20:46:54 +00:00
Pete Cooper
e4eff4b231 Clear kill flags in tail duplication.
If we duplicate an instruction then we must also clear kill flags on any uses we rewrite.
Otherwise we might be killing a register which was used in other BBs.

For example, here the entry BB ended up with these instructions, the ADD having been tail duplicated.

	%vreg24<def> = t2ADDri %vreg10<kill>, 1, pred:14, pred:%noreg, opt:%noreg; GPRnopc:%vreg24 rGPR:%vreg10
	%vreg22<def> = COPY %vreg10; GPR:%vreg22 rGPR:%vreg10

	The copy here is inserted after the add and so needs vreg10 to be live.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236782 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-07 21:48:26 +00:00
Pete Cooper
28b0dda32e Handle dead defs in the if converter.
We had code such as this:
  r2 = ...
  t2Bcc

label1:
  ldr ... r2

label2;
  return r2<dead, def>

The if converter was transforming this to
   r2<def> = ...
   return [pred] r2<dead,def>
   ldr <r2, kill>
   return

which fails the machine verifier because the ldr now reads from a dead def.

The fix here detects dead defs in stepForward and passes them back to the caller in the clobbers list.  The caller then clears the dead flag from the def is the value is live.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236660 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 22:51:04 +00:00
Pete Cooper
537ff782aa Fix incorrect kill flags in fastisel.
If called twice in the same BB on the same constant, FastISel::fastEmit_ri_ was marking the materialized vreg as killed on each use, instead of only the last use.

Change this to only mark the last use as killed by making earlier uses check if the vreg is already used elsewhere.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236650 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 22:09:29 +00:00
Pete Cooper
99413f0d40 [ARM] Fast-Isel was incorrectly selecting <2 x double> adds.
With neon enabled, we reach SelectBinaryFPOp and are able to get registers for a <2 x double> add.

However, we shouldn't actually attempt arithmetic on it as ARMIselLowering says "v2f64 is legal so that QR subregs can be extracted as f64 elements, but neither Neon nor VFP support any arithmetic operations on it."

This commit disables SelectBinaryFPOp for any vector types.  There's already a FIXME to try handle neon.  Doing so would require fixing this conditional which isn't safe for vectors 'VT == MVT::f64 || VT == MVT::i64'

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236609 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 16:39:17 +00:00
Artyom Skrobov
78fc2103c9 [ARM] generate VMAXNM/VMINNM for a compare followed by a select, in safe math mode too
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236590 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 11:44:10 +00:00
Ahmed Bougacha
caa560cfb9 [ARM][FastISel] Use TST #1 instead of CMP #0 for select.
Since r234249, i1 are sext instead of zext; because of that, doing
"CMP rN, #0; IT EQ/NE" isn't correct anymore.

"TST #1" is the conservatively correct alternative - the tradeoff being
that it doesn't have a 16-bit encoding -, so use that instead.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236569 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-06 04:14:02 +00:00
Pete Cooper
e064ab5798 Fix IfConverter to handle regmask machine operands.
Note, this is a recommit of r236515 after fixing an error in r236514.  The buildbot ran fast enough that it picked up r236514 prior to r236515 and threw an error.  r236515 itself ran 'make check' without errors.

Original commit message follows:

A regmask (typically seen on a call) clobbers the set of registers it lists.  The IfConverter, in UpdatePredRedefs, was handling register defs, but not regmasks.

These are slightly different to a def in that we need to add both an implicit use and def to appease the machine verifier.  Otherwise, uses after the if converted call could think they are reading an undefined register.

Reviewed by Matthias Braun and Quentin Colombet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236550 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-05 22:09:41 +00:00
Pete Cooper
5ffc7bfc9a Revert "Fix IfConverter to handle regmask machine operands."
This reverts commit b27413cbfd78d959c18e713bfa271fb69e6b3303 (ie r236515).

This is to get the bots green while i investigate the failures.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236517 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-05 18:49:05 +00:00
Pete Cooper
92a55e80b8 Fix IfConverter to handle regmask machine operands.
A regmask (typically seen on a call) clobbers the set of registers it lists.  The IfConverter, in UpdatePredRedefs, was handling register defs, but not regmasks.

These are slightly different to a def in that we need to add both an implicit use and def to appease the machine verifier.  Otherwise, uses after the if converted call could think they are reading an undefined register.

Reviewed by Matthias Braun and Quentin Colombet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236515 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-05 18:31:36 +00:00
Peter Collingbourne
dc1a030aec ARM: Align functions containing Thumb-2 jump tables to 4 bytes.
Functions with jump tables need an alignment of 4 because they use the ADR
instruction, which aligns the PC to 4 bytes before adding an offset.

Differential Revision: http://reviews.llvm.org/D9424

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236327 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-01 18:05:59 +00:00
Quentin Colombet
d57216e763 [ARM][TEST] Strengthen test against smarter reg alloc.
Follow-up of r236247.

rdar://problem/20770899


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236296 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-01 00:45:55 +00:00
Pete Cooper
9ff32d2fe5 [ARM] optimizeSelect should clear kill flags.
If we move an instruction from one block down to a MOVC and predicate it,
then the original instruction could be moved in to a loop.  In this case,
its invalid for any kill flags to remain on there.

Fails with -verfy-machineinstrs.

rdar://problem/20752113

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236290 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-30 23:57:47 +00:00
Pete Cooper
a4f66d25fc Commute the internal flag on MachineOperands.
When commuting a thumb instruction in the size reduction pass, thumb
instructions are represented as a bundle and so some operands may be marked
as internal.  The internal flag has to move with the operand when commuting.

This test is sensitive to register allocation so can't specifically check that
this error was happening, but so long as it continues to pass with -verify then
hopefully its still ok.

rdar://problem/20752113

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236282 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-30 23:14:14 +00:00
Pete Cooper
d47066e86b Don't always apply kill flag in thumb2 ABS pseudo expansion.
The expansion for t2ABS was always setting the kill flag on the rsb instruction.
It should instead only be set on rsb if it was set on the original ABS instruction.

rdar://problem/20752113

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236272 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-30 22:15:59 +00:00
Quentin Colombet
131da40ffd [ARM] Do not generate invalid encoding for stack adjust, even if this is just
temporary.

Because of that:
1. The machine verifier was complaining on such code.
2. The generate code worked just because the thumb reduction size pass fixed the
opcode.

rdar://problem/20749824


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236247 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-30 18:52:49 +00:00
Duncan P. N. Exon Smith
e56023a059 IR: Give 'DI' prefix to debug info metadata
Finish off PR23080 by renaming the debug info IR constructs from `MD*`
to `DI*`.  The last of the `DIDescriptor` classes were deleted in
r235356, and the last of the related typedefs removed in r235413, so
this has all baked for about a week.

Note: If you have out-of-tree code (like a frontend), I recommend that
you get everything compiling and tests passing with the *previous*
commit before updating to this one.  It'll be easier to keep track of
what code is using the `DIDescriptor` hierarchy and what you've already
updated, and I think you're extremely unlikely to insert bugs.  YMMV of
course.

Back to *this* commit: I did this using the rename-md-di-nodes.sh
upgrade script I've attached to PR23080 (both code and testcases) and
filtered through clang-format-diff.py.  I edited the tests for
test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns
were off-by-three.  It should work on your out-of-tree testcases (and
code, if you've followed the advice in the previous paragraph).

Some of the tests are in badly named files now (e.g.,
test/Assembler/invalid-mdcompositetype-missing-tag.ll should be
'dicompositetype'); I'll come back and move the files in a follow-up
commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236120 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-29 16:38:44 +00:00
Tim Northover
9f7d13868a ARM: fix peephole optimisation of TST
We were trying to look through COPY instructions, but only to the next
instruction in a BB and incorrectly anyway. The cases where that would actually
be a good idea are rare enough (and not even tested!) that it's not worth
trying to get right.

rdar://20721342

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236050 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 22:03:55 +00:00
Hans Wennborg
b176a4f2e4 Switch lowering: Take branch weight into account when ordering for fall-through
Previously, the code would try to put a fall-through case last,
even if that meant moving a case with much higher branch weight
further down the chain.

Ordering by branch weight is most important, putting a fall-through
block last is secondary.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235942 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-27 23:35:22 +00:00
Peter Collingbourne
1ad0f74155 ARM: When re-creating a branch via InsertBranch, preserve CPSR flags.
In particular, this preserves the kill flag, which allows the Thumb2 cbn?z
optimization to be applied in cases where a branch has been re-created after
the live variables analysis pass, e.g. by the machine block placement pass.

This appears to be low risk; a number of other targets seem to already be
doing something similar, e.g. AArch64, PowerPC.

Differential Revision: http://reviews.llvm.org/D9184

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235639 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 20:31:32 +00:00
Peter Collingbourne
f86c29ea2c ARM: When spilling extra registers for alignment, prefer low registers on all Thumb targets.
This makes it more likely that we can use the 16-bit push and pop instructions
on Thumb-2, saving around 4 bytes per function.

Differential Revision: http://reviews.llvm.org/D9165

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235637 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 20:31:26 +00:00
Peter Collingbourne
b28abbf98b ARM: Only enforce 4-byte alignment on Thumb-2 functions with constant pools.
This appears to have been introduced back in r76698 as part of an unrelated
change. I can find no official ARM documentation stating that Thumb-2 functions
require 4-byte alignment; in fact, ARM documentation appears to contradict
this (see, e.g., ARM Architecture Reference Manual Thumb-2 Supplement,
section 2.6.1: "Thumb-2 enforces 16-bit alignment on all instructions.").

Also remove code that sets alignment for ARM functions, which is redundant
with code in the MachineFunction constructor, and remove the hidden
-arm-align-constant-islands flag, which has been enabled by default since
r146739 (Dec 2011) and has probably received sufficient testing by now.

Differential Revision: http://reviews.llvm.org/D9138

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235636 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 20:31:22 +00:00
Hans Wennborg
defaf830f9 Re-commit r235560: Switch lowering: extract jump tables and bit tests before building binary tree (PR22262)
Third time's the charm. The previous commit was reverted as a
reverse for-loop in SelectionDAGBuilder::lowerWorkItem did 'I--'
on an iterator at the beginning of a vector, causing asserts
when using debugging iterators. This commit fixes that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235608 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 16:45:24 +00:00
Aaron Ballman
5d538f71c2 Revert r235560; this commit was causing several failed assertions in Debug builds using MSVC's STL. The iterator is being used outside of its valid range.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235597 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 13:41:59 +00:00
Hans Wennborg
395f4f4b2a Switch lowering: extract jump tables and bit tests before building binary tree (PR22262)
This is a re-commit of r235101, which also fixes the problems with the previous patch:

- Switches with only a default case and non-fallthrough were handled incorrectly

- The previous patch tickled a bug in PowerPC Early-Return Creation which is fixed here.

> This is a major rewrite of the SelectionDAG switch lowering. The previous code
> would lower switches as a binary tre, discovering clusters of cases
> suitable for lowering by jump tables or bit tests as it went along. To increase
> the likelihood of finding jump tables, the binary tree pivot was selected to
> maximize case density on both sides of the pivot.
>
> By not selecting the pivot in the middle, the binary trees would not always
> be balanced, leading to performance problems in the generated code.
>
> This patch rewrites the lowering to search for clusters of cases
> suitable for jump tables or bit tests first, and then builds the binary
> tree around those clusters. This way, the binary tree will always be balanced.
>
> This has the added benefit of decoupling the different aspects of the lowering:
> tree building and jump table or bit tests finding are now easier to tweak
> separately.
>
> For example, this will enable us to balance the tree based on profile info
> in the future.
>
> The algorithm for finding jump tables is quadratic, whereas the previous algorithm
> was O(n log n) for common cases, and quadratic only in the worst-case. This
> doesn't seem to be major problem in practice, e.g. compiling a file consisting
> of a 10k-case switch was only 30% slower, and such large switches should be rare
> in practice. Compiling e.g. gcc.c showed no compile-time difference.  If this
> does turn out to be a problem, we could limit the search space of the algorithm.
>
> This commit also disables all optimizations during switch lowering in -O0.
>
> Differential Revision: http://reviews.llvm.org/D8649

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235560 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 23:14:56 +00:00
Pirama Arumuga Nainar
1aebbfac0a Fix flakiness in fp16-promote.ll
Summary:
In the f16-promote test, make the checks for native conversion instructions
similar to the libcall checks:
- Remove hard coded register names
- Do not check exact instruction sequences.

This fixes test flakiness due to non-determinism in instruction
scheduling and register allocation.  I also fixed a few minor things in
the CHECK-LIBCALL checks.

I'll try to find a way to check that unnecessary loads, stores, or
conversions don't happen.

Reviewers: mzolotukhin, srhines, ab

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9112

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235363 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 23:54:41 +00:00
Ahmed Bougacha
35e8055393 [GlobalMerge] Look at uses to create smaller global sets.
Instead of merging everything together, look at the users of
GlobalVariables, and try to group them by function, to create
sets of globals used "together".

Using that information, a less-aggressive alternative is to keep merging
everything together *except* globals that are only ever used alone, that
is, those for which it's clearly non-profitable to merge with others.

In my testing, grouping by Function is too aggressive, but grouping by
BasicBlock is too conservative.  Anything in-between isn't trivially
available, so stick with Function grouping for now.

cl::opts are added for testing; both enabled by default.

A few of the testcases aren't testing the merging proper, but just
various edge cases when merging does occur.  Update them to use the
previous grouping behavior. Also, one of the tests is unrelated to
GlobalMerge; change it accordingly.
While there, switch to r234666' flags rather than the brutal -O3.

Differential Revision: http://reviews.llvm.org/D8070


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235249 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-18 01:21:58 +00:00
Pirama Arumuga Nainar
5c1c08cd1f Add support to promote f16 to f32
Summary:
This patch adds legalization support to operate on FP16 as a load/store type
and do operations on it as floats.

Tests for ARM are added to test/CodeGen/ARM/fp16-promote.ll

Reviewers: srhines, t.p.northover

Differential Revision: http://reviews.llvm.org/D8755

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235215 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-17 18:36:25 +00:00
David Blaikie
32b845d223 [opaque pointer type] Add textual IR support for explicit type parameter to the call instruction
See r230786 and r230794 for similar changes to gep and load
respectively.

Call is a bit different because it often doesn't have a single explicit
type - usually the type is deduced from the arguments, and just the
return type is explicit. In those cases there's no need to change the
IR.

When that's not the case, the IR usually contains the pointer type of
the first operand - but since typed pointers are going away, that
representation is insufficient so I'm just stripping the "pointerness"
of the explicit type away.

This does make the IR a bit weird - it /sort of/ reads like the type of
the first operand: "call void () %x(" but %x is actually of type "void
()*" and will eventually be just of type "ptr". But this seems not too
bad and I don't think it would benefit from repeating the type
("void (), void () * %x(" and then eventually "void (), ptr %x(") as has
been done with gep and load.

This also has a side benefit: since the explicit type is no longer a
pointer, there's no ambiguity between an explicit type and a function
that returns a function pointer. Previously this case needed an explicit
type (eg: a function returning a void() function was written as
"call void () () * @x(" rather than "call void () * @x(" because of the
ambiguity between a function returning a pointer to a void() function
and a function returning void).

No ambiguity means even function pointer return types can just be
written alone, without writing the whole function's type.

This leaves /only/ the varargs case where the explicit type is required.

Given the special type syntax in call instructions, the regex-fu used
for migration was a bit more involved in its own unique way (as every
one of these is) so here it is. Use it in conjunction with the apply.sh
script and associated find/xargs commands I've provided in rr230786 to
migrate your out of tree tests. Do let me know if any of this doesn't
cover your cases & we can iterate on a more general script/regexes to
help others with out of tree tests.

About 9 test cases couldn't be automatically migrated - half of those
were functions returning function pointers, where I just had to manually
delete the function argument types now that we didn't need an explicit
function type there. The other half were typedefs of function types used
in calls - just had to manually drop the * from those.

import fileinput
import sys
import re

pat = re.compile(r'((?:=|:|^|\s)call\s(?:[^@]*?))(\s*$|\s*(?:(?:\[\[[a-zA-Z0-9_]+\]\]|[@%](?:(")?[\\\?@a-zA-Z0-9_.]*?(?(3)"|)|{{.*}}))(?:\(|$)|undef|inttoptr|bitcast|null|asm).*$)')
addrspace_end = re.compile(r"addrspace\(\d+\)\s*\*$")
func_end = re.compile("(?:void.*|\)\s*)\*$")

def conv(match, line):
  if not match or re.search(addrspace_end, match.group(1)) or not re.search(func_end, match.group(1)):
    return line
  return line[:match.start()] + match.group(1)[:match.group(1).rfind('*')].rstrip() + match.group(2) + line[match.end():]

for line in sys.stdin:
  sys.stdout.write(conv(re.search(pat, line), line))

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235145 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-16 23:24:18 +00:00
Hans Wennborg
e2711530de Revert the switch lowering change (r235101, r235103, r235106)
Looks like it broke the sanitizer-ppc64-linux1 build. Reverting for now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235108 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-16 15:43:26 +00:00
Hans Wennborg
cc987d98bb Switch lowering: extract jump tables and bit tests before building binary tree (PR22262)
This is a major rewrite of the SelectionDAG switch lowering. The previous code
would lower switches as a binary tre, discovering clusters of cases
suitable for lowering by jump tables or bit tests as it went along. To increase
the likelihood of finding jump tables, the binary tree pivot was selected to
maximize case density on both sides of the pivot.

By not selecting the pivot in the middle, the binary trees would not always
be balanced, leading to performance problems in the generated code.

This patch rewrites the lowering to search for clusters of cases
suitable for jump tables or bit tests first, and then builds the binary
tree around those clusters. This way, the binary tree will always be balanced.

This has the added benefit of decoupling the different aspects of the lowering:
tree building and jump table or bit tests finding are now easier to tweak
separately.

For example, this will enable us to balance the tree based on profile info
in the future.

The algorithm for finding jump tables is O(n^2), whereas the previous algorithm
was O(n log n) for common cases, and quadratic only in the worst-case. This
doesn't seem to be major problem in practice, e.g. compiling a file consisting
of a 10k-case switch was only 30% slower, and such large switches should be rare
in practice. Compiling e.g. gcc.c showed no compile-time difference.  If this
does turn out to be a problem, we could limit the search space of the algorithm.

This commit also disables all optimizations during switch lowering in -O0.

Differential Revision: http://reviews.llvm.org/D8649

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235101 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-16 14:49:23 +00:00
Duncan P. N. Exon Smith
88e419d66e DebugInfo: Remove 'inlinedAt:' field from MDLocalVariable
Remove 'inlinedAt:' from MDLocalVariable.  Besides saving some memory
(variables with it seem to be single largest `Metadata` contributer to
memory usage right now in -g -flto builds), this stops optimization and
backend passes from having to change local variables.

The 'inlinedAt:' field was used by the backend in two ways:

 1. To tell the backend whether and into what a variable was inlined.
 2. To create a unique id for each inlined variable.

Instead, rely on the 'inlinedAt:' field of the intrinsic's `!dbg`
attachment, and change the DWARF backend to use a typedef called
`InlinedVariable` which is `std::pair<MDLocalVariable*, MDLocation*>`.
This `DebugLoc` is already passed reliably through the backend (as
verified by r234021).

This commit removes the check from r234021, but I added a new check
(that will survive) in r235048, and changed the `DIBuilder` API in
r235041 to require a `!dbg` attachment whose 'scope:` is in the same
`MDSubprogram` as the variable's.

If this breaks your out-of-tree testcases, perhaps the script I used
(mdlocalvariable-drop-inlinedat.sh) will help; I'll attach it to PR22778
in a moment.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235050 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-15 22:29:27 +00:00
Duncan P. N. Exon Smith
666ef776b3 DebugInfo: Add missing !dbg attachments to intrinsics
Add missing `!dbg` attachments to `@llvm.dbg.*` intrinsics.  I updated
these using a script (add-dbg-to-intrinsics.sh) that I'll attach to
PR22778 for posterity.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235040 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-15 21:04:10 +00:00
Rafael Espindola
f194367792 Update tests to not be as dependent on section numbers.
Many of these predate llvm-readobj. With elf-dump we had to match
a relocation to symbol number and symbol number to symbol name or
section number.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235015 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-15 15:59:37 +00:00
Daniel Jasper
058309ba87 Re-apply r234898 and fix tests.
This commit makes LLVM not estimate branch probabilities when doing a
single bit bitmask tests.

The code that originally made me discover this is:

  if ((a & 0x1) == 0x1) {
    ..
  }

In this case we don't actually have any branch probability information
and should not assume to have any. LLVM transforms this into:

  %and = and i32 %a, 1
  %tobool = icmp eq i32 %and, 0

So, in this case, the result of a bitwise and is compared against 0,
but nevertheless, we should not assume to have probability
information.

CodeGen/ARM/2013-10-11-select-stalls.ll started failing because the
changed probabilities changed the results of
ARMBaseInstrInfo::isProfitableToIfCvt() and led to an Ifcvt of the
diamond in the test. AFAICT, the test was never meant to test this and
thus changing the test input slightly to not change the probabilities
seems like the best way to preserve the meaning of the test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234979 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-15 06:24:07 +00:00
Krzysztof Parzyszek
b6852d12a8 Remove this test until I figure out why it fails
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234777 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-13 18:57:50 +00:00
Krzysztof Parzyszek
75f1ab4b1b Make the ARM testcase from r234764 also pass on Thumb
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234772 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-13 18:39:52 +00:00
Krzysztof Parzyszek
fcc330abfe Allow memory intrinsics to be tail calls
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234764 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-13 17:16:45 +00:00
John Brawn
afa4193fc8 [ARM] Align global variables passed to memory intrinsics
Fill in the TODO in CodeGenPrepare::OptimizeCallInst so that global
variables that are passed to memory intrinsics are aligned in the same
way that allocas are.

Differential Revision: http://reviews.llvm.org/D8421


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234735 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-13 10:47:39 +00:00
Ahmed Bougacha
d2069333ee [CodeGen] Split -enable-global-merge into ARM and AArch64 options.
Currently, there's a single flag, checked by the pass itself.
It can't force-enable the pass (and is on by default), because it
might not even have been created, as that's the targets decision.
Instead, have separate explicit flags, so that the decision is
consistently made in the target.

Keep the flag as a last-resort "force-disable GlobalMerge" for now,
for backwards compatibility.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234666 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-11 00:06:36 +00:00
Javed Absar
28c2fda9df [ARM] support for Cortex-R4/R4F
Currently, llvm (backend) doesn't know cortex-r4, even though it is the
default target for armv7r. Using "--target=armv7r-arm-none-eabi" provokes
'cortex-r4' is not a recognized processor for this target' by llvm.
This patch adds support for cortex-r4 and, very closely related, r4f.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234486 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-09 14:07:28 +00:00
Scott Douglass
7e2bc24e05 [ARM] make vminnm/vmaxnm work with ?le, ?ge and no-nans-fp-math
Because -menable-no-nans causes fcmp conditions to be rewritten
without 'o' or 'u' the recognition code in needs to cope. Also
extended it to handle 'le' and 'ge.

Differential Revision: http://reviews.llvm.org/D8725


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234421 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 17:18:28 +00:00
Sergey Dmitrouk
a7512d1d4a [ARM][Debug Info] Restore emitting of .cfi_def_cfa_offset for functions without stack frame
Summary: Looks like new code from [[ http://reviews.llvm.org/rL222057 | rL222057 ]] doesn't account for early `return` in `ARMFrameLowering::emitPrologue`, which leads to loosing `.cfi_def_cfa_offset` directive for functions without stack frame.

Reviewers: echristo, rengolin, asl, t.p.northover

Reviewed By: t.p.northover

Subscribers: llvm-commits, rengolin, aemerson

Differential Revision: http://reviews.llvm.org/D8606

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234399 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 10:10:12 +00:00
Duncan P. N. Exon Smith
b0d0a65e1d Verifier: Check that inlined-at locations agree
Check that the `MDLocalVariable::getInlinedAt()` in a debug info
intrinsic's variable always matches the `MDLocation::getInlinedAt()` of
its `!dbg` attachment.

The goal here is to get rid of `MDLocalVariable::getInlinedAt()`
entirely (PR22778), since it's expensive and unnecessary, but I'll let
this verifier check bake for a while (a week maybe?) first.  I've
updated the testcases that had the wrong value for `inlinedAt:`.

This checks that things are sane in the IR, but currently things go out
of whack in a few places in the backend.  I'll follow shortly with
assertions in the backend (with code fixes).

If you have out-of-tree testcases that just started failing, here's how
I updated these ones:

 1. The verifier check gives you the basic block, function, instruction,
    and relevant metadata arguments (metadata numbering doesn't
    necessarily match the source file, unfortunately).
 2. Look at the `@llvm.dbg.*()` instruction, and compare the
    `inlinedAt:` fields of the variable argument (second `metadata`
    argument) and the `!dbg` attachment.
 3. Figure out based on the variable `scope:` chain and the functions in
    the file whether the variable has been inlined (and into what), so
    you can determine which `inlinedAt:` is actually correct.  In all of
    the in-tree testcases, the `!MDLocation()` was correct and the
    `!MDLocalVariable()` was wrong, but YMMV.
 4. Duplicate the metadata that you're going to change, and add/drop the
    `inlinedAt:` field from one of them.  Be careful that the other
    references to the same metadata node point at the correct one.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234021 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 16:54:30 +00:00
Matthias Braun
67a2b82b14 ARM: Handle physreg targets in RegPair hints gracefully
Register coalescing can change the target of a RegPair hint to a
physreg, we should not crash on this. This also slightly improved the
way ARMBaseRegisterInfo::updateRegAllocHint() works.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233987 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 00:18:38 +00:00
James Molloy
369ee1b1f4 [SDAG] Move TRUNCATE splitting logic into a helper, and use
it more liberally.

SplitVecOp_TRUNCATE has logic for recursively splitting oversize vectors
that need more than one round of splitting to become legal. There are many
other ISD nodes that could benefit from this logic, so factor it out and
use it for FP_TO_UINT,FP_TO_SINT,SINT_TO_FP,UINT_TO_FP and FTRUNC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233681 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 10:20:58 +00:00
Akira Hatanaka
57e9efecb0 [ARM] Enable changing instprinter's behavior based on the per-function
subtarget.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233451 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 23:41:42 +00:00
Duncan P. N. Exon Smith
77a0728d96 DebugInfo: Fix bad debug info for compile units and types
Fix debug info in these tests, which started failing with a WIP patch to
verify compile units and types.  The problems look like they were all
caused by bitrot.  They fell into these categories:

  - Using `!{i32 0}` instead of `!{}`.
  - Using `!{null}` instead of `!{}`.
  - Using `!MDExpression()` instead of `!{}`.
  - Using `!8` instead of `!{!8}`.
  - `file:` references that pointed at `MDCompileUnit`s instead of the
    same `MDFile` as the compile unit.
  - `file:` references that were numerically off-by-one or (off-by-ten).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233415 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 20:46:33 +00:00
Andrew Trick
9217916725 Complete the MachineScheduler fix made way back in r210390.
"Fix the MachineScheduler's logic for updating ready times for in-order.
 Now the scheduler updates a node's ready time as soon as it is
 scheduled, before releasing dependent nodes."

This fix was only made in one variant of the ScheduleDAGMI driver.
Francois de Ferriere reported the issue in the other bit of code where
it was also needed.
I never got around to coming up with a test case, but it's an
obvious fix that shouldn't be delayed any longer.
I'll try to refactor this code a little better.

I did verify performance on a wide variety of targets and saw no
negative impact with this fix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233366 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 06:10:13 +00:00
Derek Schuff
32b33c2c30 Use movw/movt instead of constant pool loads to lower byval parameter copies
Summary:
The ARM backend can use a loop to implement copying byval parameters before
a call. In non-thumb2 mode it uses a constant pool load to materialize the
trip count. For targets that need movt instead (e.g. Native Client), use
the same code as in thumb2 mode to materialize the trip count.

Reviewers: jfb, t.p.northover

Differential Revision: http://reviews.llvm.org/D8442

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233324 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-26 22:11:00 +00:00
Vladimir Sukharev
c8a807c1c5 [ARM] Add v8.1a "Rounding Double Multiply Add/Subtract" extension
Reviewers: t.p.northover

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8503


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233301 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-26 18:29:02 +00:00
Vladimir Sukharev
27d12f3e6e [AArch64, ARM] Add v8.1a architecture and generic cpu
New architecture and cpu added, following http://community.arm.com/groups/processors/blog/2014/12/02/the-armv8-a-architecture-and-its-ongoing-development

Reviewers: t.p.northover

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8505


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233290 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-26 17:05:54 +00:00
Ahmed Bougacha
c9ad3ab624 [AArch64, ARM] Enable GlobalMerge with -O3 rather than -O1.
The pass used to be enabled by default with CodeGenOpt::Less (-O1).
This is too aggressive, considering the pass indiscriminately merges
all globals together.

Currently, performance doesn't always improve, and, on code that uses
few globals (e.g., the odd file- or function- static), more often than
not is degraded by the optimization.  Lengthy discussion can be found
on llvmdev (AArch64-focused;  ARM has similar problems):
  http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-February/082800.html
Also, it makes tooling and debuggers less useful when dealing with
globals and data sections.

GlobalMerge needs to better identify those cases that benefit, and this
will be done separately.  In the meantime, move the pass to run with
-O3 rather than -O1, on both ARM and AArch64.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233024 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-23 21:17:36 +00:00
Bradley Smith
a75fecc370 Revert "[ARM] Add more pattern matching for f16 <-> f64 conversions"
This change is incorrect since it converts double rounding into single rounding,
which can produce different results. Instead this optimization will be done by
modifying Clang's codegen to not produce double rounding in the first place.

This reverts commit r232954.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232962 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-23 16:52:52 +00:00
Bradley Smith
de5be4657f [ARM] Add more pattern matching for f16 <-> f64 conversions
Specifically when the conversion is done in two steps, f16 -> f32 -> f64.

For example:

%1 = tail call float @llvm.convert.from.fp16.f32(i16 %0)
%conv = fpext float %1 to double

to:

vcvtb.f64.f16


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232954 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-23 15:59:54 +00:00
Ahmed Bougacha
995f4f8fd1 [CodeGen][IfCvt] Don't re-ifcvt blocks with unanalyzable terminators.
If we couldn't analyze its terminator (i.e., it's an indirectbr, or some
other weirdness), we can't safely re-if-convert a predicated block,
because we can't tell whether the predicated terminator can
fallthrough (it does).

Currently, we would completely ignore the fallthrough successor. In
the added testcase, this means we used to generate:

    ...
  @ %entry:
    cmp   r5, #21
    ittt  ne
  @ %cc1f:
    cmpne r7, #42
  @ %cc2t:
    strne.w       r5, [r8]
    movne pc, r10
  @ %cc1t:
    ...

Whereas the successor of %cc1f was originally %bb1.
With the fix, we get the correct:

    ...
  @ %entry:
    cmp   r5, #21
    itt   eq
  @ %cc1t:
    streq.w       r5, [r11]
    moveq pc, r0
  @ %cc1f:
    cmp   r7, #42
    itt   ne
  @ %cc2t:
    strne.w       r5, [r8]
    movne pc, r10
  @ %bb1:
    ...

rdar://20192768
Differential Revision: http://reviews.llvm.org/D8509


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232872 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-21 01:23:15 +00:00
Rafael Espindola
d80979b25d Don't declare all text sections at the start of the .s
The code this patch removes was there to make sure the text sections went
before the dwarf sections. That is necessary because MachO uses offsets
relative to the start of the file, so adding a section can change relaxations.

The dwarf sections were being printed at the start just to produce symbols
pointing at the start of those sections.

The underlying issue was fixed in r231898. The dwarf sections are now printed
when they are about to be used, which is after we printed the text sections.

To make sure we don't regress, the patch makes the MachO streamer assert
if CodeGen puts anything unexpected after the DWARF sections.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232842 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-20 20:00:01 +00:00
Owen Anderson
8154ef7589 Fix a nasty bug in DAGCombine of STORE nodes.
This is very related to the bug fixed in r174431.  The problem is that
SelectionDAG does not include alignment in the uniquing of loads and
stores.  When an otherwise no-op DAGCombine would increase the alignment
of a load or store, the original node would be returned (with the
alignment increased), which would cause the node not to be processed by
any further DAGCombines.

I don't have a direct testcase for this that manifests on an in-tree
target, but I did see some noise in the tests for other targets and have
updated them for it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232780 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-19 22:48:57 +00:00
John Brawn
0328ca6cd7 [ARM] Align stack objects passed to memory intrinsics
Memcpy, and other memory intrinsics, typically tries to use LDM/STM if
the source and target addresses are 4-byte aligned. In CodeGenPrepare
look for calls to memory intrinsics and, if the object is on the
stack, 4-byte align it if it's large enough that we expect that memcpy
would want to use LDM/STM to copy it.

Differential Revision: http://reviews.llvm.org/D7908


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232627 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-18 12:01:59 +00:00
David Majnemer
7605cdd6e4 COFF: Let globals with private linkage reside in their own section
COFF COMDATs (for selection kinds other than 'select any') require at
least one non-section symbol in the symbol table.
Satisfy this by morally enhancing the linkage from private to internal.

Differential Revision: http://reviews.llvm.org/D8394

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232570 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 23:54:51 +00:00
Richard Barton
b59aee170f [ARM] Fix offset calculation in ARMBaseRegisterInfo::needsFrameBaseReg
The input offset to needsFrameBaseReg is a negative value below the top of the
stack frame, but when converting to a positive offset from the bottom of the
stack frame this value was negated, causing the final offset to be too large
by twice the input offset's magnitude. Fix that by not negating the offset.

Patch by John Brawn

Differential Revision: http://reviews.llvm.org/D8316

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232513 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 18:20:47 +00:00
Rafael Espindola
a480f88b3c Move the EH symbol to the asm printer and use it for the SJLJ case too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232475 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 13:57:48 +00:00
Renato Golin
ce1f16421f [ARM] Add support for ARMV6K subtarget (LLVM)
ARMv6K is another layer between ARMV6 and ARMV6T2. This is the LLVM
side of the changes.

ARMV6 family LLVM implementation.

+-------------------------------------+
| ARMV6                               |
+----------------+--------------------+
| ARMV6M (thumb) | ARMV6K (arm,thumb) | <- From ARMV6K and ARMV6M processors
+----------------+--------------------+    have support for hint instructions
| ARMV6T2 (arm,thumb,thumb2)          |    (SEV/WFE/WFI/NOP/YIELD). They can
+-------------------------------------+    be either real or default to NOP.
| ARMV7 (arm,thumb,thumb2)            |    The two processors also use
+-------------------------------------+    different encoding for them.

Patch by Vinicius Tinti.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232468 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-17 11:55:28 +00:00
David Blaikie
5a70dd1d82 [opaque pointer type] Add textual IR support for explicit type parameter to gep operator
Similar to gep (r230786) and load (r230794) changes.

Similar migration script can be used to update test cases, which
successfully migrated all of LLVM and Polly, but about 4 test cases
needed manually changes in Clang.

(this script will read the contents of stdin and massage it into stdout
- wrap it in the 'apply.sh' script shown in previous commits + xargs to
apply it over a large set of test cases)

import fileinput
import sys
import re

rep = re.compile(r"(getelementptr(?:\s+inbounds)?\s*\()((<\d*\s+x\s+)?([^@]*?)(|\s*addrspace\(\d+\))\s*\*(?(3)>)\s*)(?=$|%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|zeroinitializer|<|\[\[[a-zA-Z]|\{\{)", re.MULTILINE | re.DOTALL)

def conv(match):
  line = match.group(1)
  line += match.group(4)
  line += ", "
  line += match.group(2)
  return line

line = sys.stdin.read()
off = 0
for match in re.finditer(rep, line):
  sys.stdout.write(line[off:match.start()])
  sys.stdout.write(conv(match))
  off = match.end()
sys.stdout.write(line[off:])

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232184 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-13 18:20:45 +00:00
Tim Northover
52f83a9ab3 ARM: simplify and extend byval handling
The main issue being fixed here is that APCS targets handling a "byval align N"
parameter with N > 4 were miscounting what objects were where on the stack,
leading to FrameLowering setting the frame pointer incorrectly and clobbering
the stack.

But byval handling had grown over many years, and had multiple layers of cruft
trying to compensate for each other and calculate padding correctly. This only
really needs to be done once, in the HandleByVal function. Elsewhere should
just do what it's told by that call.

I also stripped out unnecessary APCS/AAPCS distinctions (now that Clang emits
byvals with the correct C ABI alignment), which simplified HandleByVal.

rdar://20095672

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231959 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-11 18:54:22 +00:00
Eric Christopher
80d55e6bcb Remove use of misched-bench from this test and replace it with
non-temporary enabling options. This is part of removing misched-bench
as an option.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231546 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-07 01:39:06 +00:00
Eric Christopher
5dc2251b4e Recommit r231324 with a fix to the ARM execution domain code
to disable lane switching if we don't actually have the instruction
set we want to switch to. Models the earlier check above the
conditional for the pass.

The testcase is one that triggered with the assert that's added
as part of the fix, use it to avoid adding a new testcase as it
highlights the same problem.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231539 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-07 00:12:22 +00:00
Matthias Braun
47941aa098 DAGCombiner: Canonicalize select(and/or,x,y) depending on target.
This is based on the following equivalences:
select(C0 & C1, X, Y) <=> select(C0, select(C1, X, Y), Y)
select(C0 | C1, X, Y) <=> select(C0, X, select(C1, X, Y))

Many target cannot perform and/or on the CPU flags and therefore the
right side should be choosen to avoid materializign the i1 flags in an
integer register. If the target can perform this operation efficiently
we normalize to the left form.

Differential Revision: http://reviews.llvm.org/D7622

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231507 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-06 19:49:10 +00:00
Ahmed Bougacha
67297cd956 [ARM] Enable vector extload combine for legal types.
This commit enables forming vector extloads for ARM.
It only does so for legal types, and when we can't fold the extension
in a wide/long form of the user instruction.

Enabling it for larger types isn't as good an idea on ARM as it is on
X86, because: 
- we pretend that extloads are legal, but end up generating vld+vmov
- we have instructions like vld {dN, dM}, which can't be generated
  when we "manually expand" extloads to vld+vmov.

For legal types, the combine doesn't fire that often: in the
integration tests only in a big endian testcase, where it removes a
pointless AND.

Related to rdar://19723053
Differential Revision: http://reviews.llvm.org/D7423


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231396 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-05 19:37:53 +00:00
Matthias Braun
29aeaf5408 Improve test robustness
Improve test robustness in preparation of coming commits:
- Avoid undefs which may get propagated too much.
- Remove several pointless add 0, instructions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231307 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-04 22:31:18 +00:00
Adrian Prantl
2e74ddea3a Update the out-of-date dwarf expressions in these testcases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231261 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-04 17:39:59 +00:00
Duncan P. N. Exon Smith
b056aa798d DebugInfo: Move new hierarchy into place
Move the specialized metadata nodes for the new debug info hierarchy
into place, finishing off PR22464.  I've done bootstraps (and all that)
and I'm confident this commit is NFC as far as DWARF output is
concerned.  Let me know if I'm wrong :).

The code changes are fairly mechanical:

  - Bumped the "Debug Info Version".
  - `DIBuilder` now creates the appropriate subclass of `MDNode`.
  - Subclasses of DIDescriptor now expect to hold their "MD"
    counterparts (e.g., `DIBasicType` expects `MDBasicType`).
  - Deleted a ton of dead code in `AsmWriter.cpp` and `DebugInfo.cpp`
    for printing comments.
  - Big update to LangRef to describe the nodes in the new hierarchy.
    Feel free to make it better.

Testcase changes are enormous.  There's an accompanying clang commit on
its way.

If you have out-of-tree debug info testcases, I just broke your build.

  - `upgrade-specialized-nodes.sh` is attached to PR22564.  I used it to
    update all the IR testcases.
  - Unfortunately I failed to find way to script the updates to CHECK
    lines, so I updated all of these by hand.  This was fairly painful,
    since the old CHECKs are difficult to reason about.  That's one of
    the benefits of the new hierarchy.

This work isn't quite finished, BTW.  The `DIDescriptor` subclasses are
almost empty wrappers, but not quite: they still have loose casting
checks (see the `RETURN_FROM_RAW()` macro).  Once they're completely
gutted, I'll rename the "MD" classes to "DI" and kill the wrappers.  I
also expect to make a few schema changes now that it's easier to reason
about everything.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231082 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-03 17:24:31 +00:00
Adrian Prantl
994176ad7c Refactor DebugLocDWARFExpression so it doesn't require access to the
TargetRegisterInfo. DebugLocEntry now holds a buffer with the raw bytes
of the pre-calculated DWARF expression.

Ought to be NFC, but it does slightly alter the output format of the
textual assembly.

This reapplies 230930 without the assertion in DebugLocEntry::finalize()
because not all Machine registers can be lowered into DWARF register
numbers and floating point constants cannot be expressed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231023 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-02 22:02:33 +00:00
Adrian Prantl
a2e69c9c58 Revert "Refactor DebugLocDWARFExpression so it doesn't require access to the"
This reverts commit 230975 to investigate buildbot breakage.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231004 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-02 20:01:54 +00:00
Adrian Prantl
9680c9c1a8 Refactor DebugLocDWARFExpression so it doesn't require access to the
TargetRegisterInfo. DebugLocEntry now holds a buffer with the raw bytes
of the pre-calculated DWARF expression.

Ought to be NFC, but it does slightly alter the output format of the
textual assembly.

This reapplies 230930 with a relaxed assertion in DebugLocEntry::finalize()
that allows for empty DWARF expressions for constant FP values.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230975 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-02 17:21:06 +00:00
Nico Weber
5e871d0b9c Revert r230930, it caused PR22747.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230932 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-02 04:37:11 +00:00
Adrian Prantl
d21acaf6a1 Refactor DebugLocDWARFExpression so it doesn't require access to the
TargetRegisterInfo. DebugLocEntry now holds a buffer with the raw bytes
of the pre-calculated DWARF expression.

Ought to be NFC, but it does slightly alter the output format of the
textual assembly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230930 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-02 02:38:18 +00:00
David Blaikie
7c9c6ed761 [opaque pointer type] Add textual IR support for explicit type parameter to load instruction
Essentially the same as the GEP change in r230786.

A similar migration script can be used to update test cases, though a few more
test case improvements/changes were required this time around: (r229269-r229278)

import fileinput
import sys
import re

pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)")

for line in sys.stdin:
  sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line))

Reviewers: rafael, dexonsmith, grosser

Differential Revision: http://reviews.llvm.org/D7649

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230794 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 21:17:42 +00:00
David Blaikie
198d8baafb [opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction
One of several parallel first steps to remove the target type of pointers,
replacing them with a single opaque pointer type.

This adds an explicit type parameter to the gep instruction so that when the
first parameter becomes an opaque pointer type, the type to gep through is
still available to the instructions.

* This doesn't modify gep operators, only instructions (operators will be
  handled separately)

* Textual IR changes only. Bitcode (including upgrade) and changing the
  in-memory representation will be in separate changes.

* geps of vectors are transformed as:
    getelementptr <4 x float*> %x, ...
  ->getelementptr float, <4 x float*> %x, ...
  Then, once the opaque pointer type is introduced, this will ultimately look
  like:
    getelementptr float, <4 x ptr> %x
  with the unambiguous interpretation that it is a vector of pointers to float.

* address spaces remain on the pointer, not the type:
    getelementptr float addrspace(1)* %x
  ->getelementptr float, float addrspace(1)* %x
  Then, eventually:
    getelementptr float, ptr addrspace(1) %x

Importantly, the massive amount of test case churn has been automated by
same crappy python code. I had to manually update a few test cases that
wouldn't fit the script's model (r228970,r229196,r229197,r229198). The
python script just massages stdin and writes the result to stdout, I
then wrapped that in a shell script to handle replacing files, then
using the usual find+xargs to migrate all the files.

update.py:
import fileinput
import sys
import re

ibrep = re.compile(r"(^.*?[^%\w]getelementptr inbounds )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")
normrep = re.compile(       r"(^.*?[^%\w]getelementptr )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")

def conv(match, line):
  if not match:
    return line
  line = match.groups()[0]
  if len(match.groups()[5]) == 0:
    line += match.groups()[2]
  line += match.groups()[3]
  line += ", "
  line += match.groups()[1]
  line += "\n"
  return line

for line in sys.stdin:
  if line.find("getelementptr ") == line.find("getelementptr inbounds"):
    if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("):
      line = conv(re.match(ibrep, line), line)
  elif line.find("getelementptr ") != line.find("getelementptr ("):
    line = conv(re.match(normrep, line), line)
  sys.stdout.write(line)

apply.sh:
for name in "$@"
do
  python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name"
  rm -f "$name.tmp"
done

The actual commands:
From llvm/src:
find test/ -name *.ll | xargs ./apply.sh
From llvm/src/tools/clang:
find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}"
From llvm/src/tools/polly:
find test/ -name *.ll | xargs ./apply.sh

After that, check-all (with llvm, clang, clang-tools-extra, lld,
compiler-rt, and polly all checked out).

The extra 'rm' in the apply.sh script is due to a few files in clang's test
suite using interesting unicode stuff that my python script was throwing
exceptions on. None of those files needed to be migrated, so it seemed
sufficient to ignore those cases.

Reviewers: rafael, dexonsmith, grosser

Differential Revision: http://reviews.llvm.org/D7636

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230786 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 19:29:02 +00:00
Mehdi Amini
26d628d6ce Change the fast-isel-abort option from bool to int to enable "levels"
Summary:
Currently fast-isel-abort will only abort for regular instructions,
and just warn for function calls, terminators, function arguments.
There is already fast-isel-abort-args but nothing for calls and
terminators.

This change turns the fast-isel-abort options into an integer option,
so that multiple levels of strictness can be defined.
This will help no being surprised when the "abort" option indeed does
not abort, and enables the possibility to write test that verifies
that no intrinsics are forgotten by fast-isel.

Reviewers: resistor, echristo

Subscribers: jfb, llvm-commits

Differential Revision: http://reviews.llvm.org/D7941

From: Mehdi Amini <mehdi.amini@apple.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230775 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 18:32:11 +00:00
Renato Golin
636aacf211 Equally to NetBSD, Bitrig/ARM uses the Itanium-ABI.
Patch by Patrick Wildt.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230762 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 16:35:27 +00:00
Petar Jovanovic
8407da0dbc Pass correct -mtriple for krait-cpu-div-attribute.ll
Not passing mtriple for one of the tests caused a regression failure
on MIPS buildbot. The issue was introduced by r230651.

Differential Revision: http://reviews.llvm.org/D7938


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230756 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-27 14:46:41 +00:00
Sumanth Gundapaneni
adaebc8b56 Use ".arch_extension" ARM directive to support hwdiv on krait
In case of "krait" CPU, asm printer doesn't emit any ".cpu" so the
features bits are not computed. This patch lets the asm printer
emit ".cpu cortex-a9" directive for krait and the hwdiv feature is
enabled through ".arch_extension". In short, krait is treated
as "cortex-a9" with hwdiv. We can not emit ".krait" as CPU since
it is not supported bu GNU GAS yet


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230651 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-26 18:08:41 +00:00
Renato Golin
b451f4e376 Improve handling of stack accesses in Thumb-1
Thumb-1 only allows SP-based LDR and STR to be word-sized, and SP-base LDR,
STR, and ADD only allow offsets that are a multiple of 4. Make some changes
to better make use of these instructions:

* Use word loads for anyext byte and halfword loads from the stack.
* Enforce 4-byte alignment on objects accessed in this way, to ensure that
  the offset is valid.
* Do the same for objects whose frame index is used, in order to avoid having
  to use more than one ADD to generate the frame index.
* Correct how many bits of offset we think AddrModeT1_s has.

Patch by John Brawn.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230496 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-25 14:41:06 +00:00
Simon Pilgrim
0115755020 Added test case for PR22678 (check CONCAT_VECTORS DAG combiner pass doesn't introduce illegal types)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230386 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-24 21:46:23 +00:00
Tim Northover
5530ac99e6 ARM: treat [N x i32] and [N x i64] as AAPCS composite types
The logic is almost there already, with our special homogeneous aggregate
handling. Tweaking it like this allows front-ends to emit AAPCS compliant code
without ever having to count registers or add discarded padding arguments.

Only arrays of i32 and i64 are needed to model AAPCS rules, but I decided to
apply the logic to all integer arrays for more consistency.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230348 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-24 17:22:34 +00:00
Ahmed Bougacha
5898fc70ec [ARM] Re-re-apply VLD1/VST1 base-update combine.
This re-applies r223862, r224198, r224203, and r224754, which were
reverted in r228129 because they exposed Clang misalignment problems
when self-hosting.

The combine caused the crashes because we turned ISD::LOAD/STORE nodes
to ARMISD::VLD1/VST1_UPD nodes.  When selecting addressing modes, we
were very lax for the former, and only emitted the alignment operand
(as in "[r1:128]") when it was larger than the standard alignment of
the memory type.

However, for ARMISD nodes, we just used the MMO alignment, no matter
what.  In our case, we turned ISD nodes to ARMISD nodes, and this
caused the alignment operands to start being emitted.

And that's how we exposed alignment problems that were ignored before
(but I believe would have been caught with SCTRL.A==1?).

To fix this, we can just mirror the hack done for ISD nodes:  only
take into account the MMO alignment when the access is overaligned.

Original commit message:
We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD
when the base pointer is incremented after the load/store.

We can do the same thing for generic load/stores.

Note that we can only combine the first load/store+adds pair in
a sequence (as might be generated for a v16f32 load for instance),
because other combines turn the base pointer addition chain (each
computing the address of the next load, from the address of the last
load) into independent additions (common base pointer + this load's
offset).

rdar://19717869, rdar://14062261.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229932 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-19 23:52:41 +00:00
Peter Collingbourne
7d3b145da4 llvm-mc: Use Target::createNullStreamer to fix crashes on target-specific asm directives.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229798 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-19 00:45:04 +00:00
Bradley Smith
4fe6d075d5 [ARM] Add missing M/R class CPUs
Add some of the missing M and R class Cortex CPUs, namely:

Cortex-M0+ (called Cortex-M0plus for GCC compatibility)
Cortex-M1
SC000
SC300
Cortex-R5


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229660 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-18 10:33:30 +00:00
Sanjay Patel
1115b2c27e Canonicalize splats as build_vectors (PR22283)
This is a follow-on patch to:
http://reviews.llvm.org/D7093

That patch canonicalized constant splats as build_vectors, 
and this patch removes the constant check so we can canonicalize
all splats as build_vectors.

This fixes the 2nd test case in PR22283:
http://llvm.org/bugs/show_bug.cgi?id=22283

The unfortunate code duplication between SelectionDAG and DAGCombiner
is discussed in the earlier patch review. At least this patch is just
removing code...

This improves an existing x86 AVX test and changes codegen in an ARM test.

Differential Revision: http://reviews.llvm.org/D7389


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229511 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-17 16:54:32 +00:00
James Molloy
4de471dd0a [SimplifyCFG] Swap to using TargetTransformInfo for cost
analysis.

We're already using TTI in SimplifyCFG, so remove the hard-baked "cheapness"
heuristic and use TTI directly. Generally NFC intended, but we're using a slightly
different heuristic now so there is a slight test churn.

Test changes:
  * combine-comparisons-by-cse.ll: Removed unneeded branch check.
  * 2014-08-04-muls-it.ll: Test now doesn't branch but emits muleq.
  * coalesce-subregs.ll: Superfluous block check.
  * 2008-01-02-hoist-fp-add.ll: fadd is safe to speculate. Change to udiv.
  * PhiBlockMerge.ll: Superfluous CFG checking code. Main checks still present.
  * select-gep.ll: A variable GEP is not expensive, just TCC_Basic, according to the TTI.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228826 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-11 12:15:41 +00:00
Bradley Smith
cec93b661d [ARM] Add armv6s[-]m as an alias to armv6[-]m
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228696 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-10 15:15:08 +00:00
Akira Hatanaka
522cf235e9 Fix a bug in DemoteRegToStack where a reload instruction was inserted into the
wrong basic block.

This would happen when the result of an invoke was used by a phi instruction
in the invoke's normal destination block. An instruction to reload the invoke's
value would get inserted before the critical edge was split and a new basic
block (which is the correct insertion point for the reload) was created. This
commit fixes the bug by splitting the critical edge before all the reload
instructions are inserted.

Also, hoist up the code which computes the insertion point to the only place
that need that computation.

rdar://problem/15978721


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228566 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-09 06:38:23 +00:00
Tim Northover
848d812931 ARM & AArch64: teach LowerVSETCC that output type size may differ from input.
While various DAG combines try to guarantee that a vector SETCC
operation will have the same output size as input, there's nothing
intrinsic to either creation or LegalizeTypes that actually guarantees
it, so the function needs to be ready to handle a mismatch.

Fortunately this is easy enough, just extend or truncate the naturally
compared result.

I couldn't reproduce the failure in other backends that I know have
SIMD, so it's probably only an issue for these two due to shared
heritage.

Should fix PR21645.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228518 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-08 00:50:47 +00:00
David Majnemer
fdac306a12 MC: Emit COFF section flags in the "proper" order
COFF section flags are not idempotent:
  'rd' will make a read-write section because 'd' implies write
  'dr' will make a read-only section because 'r' disables write

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228490 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-07 08:26:40 +00:00
Ahmed Bougacha
a7f2cf45f3 [ARM] Use patterns instead of hardcoded regs in test. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228259 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-05 01:52:19 +00:00
Ahmed Bougacha
42ec3433ef [ARM] Make testcase more explicit. NFC.
The q8/d16 thing is silly;  I'd be happy to hear about a better
way to write those tests where simple substitution isn't enough..


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228258 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-05 01:45:28 +00:00
Renato Golin
0966a4e370 Adding support to LLVM for targeting Cortex-A72
Currently, Cortex-A72 is modelled as an Cortex-A57 except the fp
load balancing pass isn't enabled for Cortex-A72 as it's not
profitable to have it enabled for this core.

Patch by Ranjeet Singh.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228140 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-04 13:31:29 +00:00
Renato Golin
ff01f89466 Reverting VLD1/VST1 base-updating/post-incrementing combining
This reverts patches 223862, 224198, 224203, and 224754, which were all
related to the vector load/store combining and were reverted/reaplied
a few times due to the same alignment problems we're seeing now.

Further tests, mainly self-hosting Clang, will be needed to reapply this
patch in the future.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228129 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-04 10:11:59 +00:00
Jan Wen Voung
1a63641597 Fix ARM peephole optimizeCompare to avoid optimizing unsigned cmp to 0.
Summary:
Previously it only avoided optimizing signed comparisons to 0.
Sometimes the DAGCombiner will optimize the unsigned comparisons
to 0 before it gets to the peephole pass, but sometimes it doesn't.

Fix for PR22373.

Test Plan: test/CodeGen/ARM/sub-cmp-peephole.ll

Reviewers: jfb, manmanren

Subscribers: aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D7274

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227809 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-02 16:56:50 +00:00
Saleem Abdulrasool
81674807cc ARM: support stack probe size on Windows on ARM
Now that -mstack-probe-size is piped through to the backend via the function
attribute as on Windows x86, honour the value to permit handling of non-default
values for stack probes.  This is needed /Gs with the clang-cl driver or
-mstack-probe-size with the clang driver when targeting Windows on ARM.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227667 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-31 02:26:37 +00:00
Charlie Turner
1a8618cbbf Add a missing Tag_DIV_use test for Cortex-M7.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227429 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-29 11:19:54 +00:00
Jyoti Allur
245caec9b3 This patch fixes issue with lowering below mentioned pattern :-
_foo:
        smull	 r0, r1, r1, r0
	smull	 r2, r3, r3, r2
	adds	r0, r2, r0
	adc	r1, r3, r1
	bx	lr

to

_foo:
        smull	 r0, r1, r1, r0
	smlal	 r0, r1, r3, r2
	bx	lr


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226904 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-23 09:10:03 +00:00
Jonathan Roelofs
cab5680f6c Fix load-store optimizer on thumbv4t
Thumbv4t does not have lo->lo copies other than MOVS,
and that can't be predicated. So emit MOVS when needed
and bail if there's a predicate.

http://reviews.llvm.org/D6592


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226711 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-21 22:39:43 +00:00
Rafael Espindola
4b678bff4e Bring r226038 back.
No change in this commit, but clang was changed to also produce trivial comdats when
needed.

Original message:

Don't create new comdats in CodeGen.

This patch stops the implicit creation of comdats during codegen.

Clang now sets the comdat explicitly when it is required. With this patch clang and gcc
now produce the same result in pr19848.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226467 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-19 15:16:06 +00:00
Timur Iskhodzhanov
6a7c74de33 Revert r226242 - Revert Revert Don't create new comdats in CodeGen
This breaks AddressSanitizer (ninja check-asan) on Windows

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226251 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-16 08:38:45 +00:00
Rafael Espindola
dfe88a08c7 Revert "Revert Don't create new comdats in CodeGen"
This reverts commit r226173, adding r226038 back.

No change in this commit, but clang was changed to also produce trivial comdats for
costructors, destructors and vtables when needed.

Original message:

Don't create new comdats in CodeGen.

This patch stops the implicit creation of comdats during codegen.

Clang now sets the comdat explicitly when it is required. With this patch clang and gcc
now produce the same result in pr19848.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226242 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-16 02:22:55 +00:00
Hal Finkel
39b09ae788 Revert "r226086 - Revert "r226071 - [RegisterCoalescer] Remove copies to reserved registers""
Reapply r226071 with fixes. Two fixes:

 1. We need to manually remove the old and create the new 'deaf defs'
    associated with physical register definitions when we move the definition of
    the physical register from the copy point to the point of the original vreg def.

    This problem was picked up by the machinstr verifier, and could trigger a
    verification failure on test/CodeGen/X86/2009-02-12-DebugInfoVLA.ll, so I've
    turned on the verifier in the tests.

 2. When moving the def point of the phys reg up, we need to make sure that it
    is neither defined nor read in between the two instructions. We don't, however,
    extend the live ranges of phys reg defs to cover uses, so just checking for
    live-range overlap between the pair interval and the phys reg aliases won't
    pick up reads. As a result, we manually iterate over the range and check for
    reads.

    A test soon to be committed to the PowerPC backend will test this change.

Original commit message:

[RegisterCoalescer] Remove copies to reserved registers

This allows the RegisterCoalescer to join "non-flipped" range pairs with a
physical destination register -- which allows the RegisterCoalescer to remove
copies like this:

<vreg> = something (maybe a load, for example)
... (things that don't use PHYSREG)
PHYSREG = COPY <vreg>

(with all of the restrictions normally applied by the RegisterCoalescer: having
compatible register classes, etc. )

Previously, the RegisterCoalescer handled only the opposite case (copying
*from* a physical register). I don't handle the problem fully here, but try to
get the common case where there is only one use of <vreg> (the COPY).

An upcoming commit to the PowerPC backend will make this pattern much more
common on PPC64/ELF systems.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226200 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-15 20:32:09 +00:00
Timur Iskhodzhanov
d048b3be70 Revert Don't create new comdats in CodeGen
It breaks AddressSanitizer on Windows.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226173 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-15 16:14:34 +00:00
Hal Finkel
f908e37144 Revert "r226071 - [RegisterCoalescer] Remove copies to reserved registers"
Reverting this while I investigate some bad behavior this is causing. As a
possibly-related issue, adding -verify-machineinstrs to one of the test cases
now fails because of this change:

  llc test/CodeGen/X86/2009-02-12-DebugInfoVLA.ll -march=x86-64 -o - -verify-machineinstrs

*** Bad machine code: No instruction at def index ***
- function:    foo
- basic block: BB#0 return (0x10007e21f10) [0B;736B)
- liverange:   [128r,128d:9)[160r,160d:8)[176r,176d:7)[336r,336d:6)[464r,464d:5)[480r,480d:4)[624r,624d:3)[752r,752d:2)[768r,768d:1)[78
4r,784d:0)  0@784r 1@768r 2@752r 3@624r 4@480r 5@464r 6@336r 7@176r 8@160r 9@128r
- register:    %DS
Valno #3 is defined at 624r

*** Bad machine code: Live segment doesn't end at a valid instruction ***
- function:    foo
- basic block: BB#0 return (0x10007e21f10) [0B;736B)
- liverange:   [128r,128d:9)[160r,160d:8)[176r,176d:7)[336r,336d:6)[464r,464d:5)[480r,480d:4)[624r,624d:3)[752r,752d:2)[768r,768d:1)[78
4r,784d:0)  0@784r 1@768r 2@752r 3@624r 4@480r 5@464r 6@336r 7@176r 8@160r 9@128r
- register:    %DS
[624r,624d:3)
LLVM ERROR: Found 2 machine code errors.

where 624r corresponds exactly to the interval combining change:

624B    %RSP<def> = COPY %vreg16; GR64:%vreg16
        Considering merging %vreg16 with %RSP
                RHS = %vreg16 [608r,624r:0)  0@608r
                updated: 608B   %RSP<def> = MOV64rm <fi#3>, 1, %noreg, 0, %noreg; mem:LD8[%saved_stack.1]
        Success: %vreg16 -> %RSP
        Result = %RSP

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226086 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-15 03:08:59 +00:00
Hal Finkel
47ab8c106f [RegisterCoalescer] Remove copies to reserved registers
This allows the RegisterCoalescer to join "non-flipped" range pairs with a
physical destination register -- which allows the RegisterCoalescer to remove
copies like this:

<vreg> = something (maybe a load, for example)
... (things that don't use PHYSREG)
PHYSREG = COPY <vreg>

(with all of the restrictions normally applied by the RegisterCoalescer: having
compatible register classes, etc. )

Previously, the RegisterCoalescer handled only the opposite case (copying
*from* a physical register). I don't handle the problem fully here, but try to
get the common case where there is only one use of <vreg> (the COPY).

An upcoming commit to the PowerPC backend will make this pattern much more
common on PPC64/ELF systems.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226071 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-15 01:25:28 +00:00
Duncan P. N. Exon Smith
37ac8d3622 IR: Move MDLocation into place
This commit moves `MDLocation`, finishing off PR21433.  There's an
accompanying clang commit for frontend testcases.  I'll attach the
testcase upgrade script I used to PR21433 to help out-of-tree
frontends/backends.

This changes the schema for `DebugLoc` and `DILocation` from:

    !{i32 3, i32 7, !7, !8}

to:

    !MDLocation(line: 3, column: 7, scope: !7, inlinedAt: !8)

Note that empty fields (line/column: 0 and inlinedAt: null) don't get
printed by the assembly writer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226048 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-14 22:27:36 +00:00
Rafael Espindola
33f5127540 Don't create new comdats in CodeGen.
This patch stops the implicit creation of comdats during codegen.

Clang now sets the comdat explicitly when it is required. With this patch clang and gcc
now produce the same result in pr19848.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226038 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-14 20:55:48 +00:00