llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-04-13 21:10:39 +00:00

Simon Dardis 0a381d6586 [SelectionDAG] Enable target specific vector scalarization of calls and returns

By target hookifying getRegisterType, getNumRegisters, getVectorBreakdown,
backends can request that LLVM scalarize vector types for calls
and returns.

The MIPS vector ABI requires that vector arguments and returns are passed in
integer registers. With SelectionDAG's new hooks, the MIPS backend can now
handle LLVM-IR with vector types in calls and returns. E.g.
'call @foo(<4 x i32> %4)'.

Previously these cases would be scalarized for the MIPS O32/N32/N64 ABI for
calls and returns if vector types were not legal. If vector types were legal,
a single 128bit vector argument would be assigned to a single 32 bit / 64 bit
integer register.

By teaching the MIPS backend to inspect the original types, it can now
implement the MIPS vector ABI which requires a particular method of
scalarizing vectors.

Previously, the MIPS backend relied on clang to scalarize types such as "call
@foo(<4 x float> %a)" into "call @foo(i32 inreg %1, i32 inreg %2, i32 inreg %3,
i32 inreg %4)".

This patch enables the MIPS backend to take either form for vector types.

Reviewers: zoran.jovanovic, jaydeep, vkalintiris, slthakur

Differential Revision: https://reviews.llvm.org/D27845

llvm-svn: 299766

2017-04-07 13:03:52 +00:00

AsmPrinter

Move llvm::canBeOmittedFromSymbolTable() to Analysis.

2017-03-31 04:46:31 +00:00

GlobalISel

[tablegen][globalisel] Add support for nested instruction matching.

2017-04-04 13:25:23 +00:00

MIRParser

[MIR] Support Customed Register Mask and CSRs

2017-03-19 08:14:18 +00:00

SelectionDAG

[SelectionDAG] Enable target specific vector scalarization of calls and returns

2017-04-07 13:03:52 +00:00

AggressiveAntiDepBreaker.cpp

getPristineRegs is not accurately considering shrink wrapping puts

2017-03-30 22:34:20 +00:00

AggressiveAntiDepBreaker.h

…

AllocationOrder.cpp

Use the range variant of find instead of unpacking begin/end

2016-08-11 22:21:41 +00:00

AllocationOrder.h

Use the range variant of find instead of unpacking begin/end

2016-08-11 22:21:41 +00:00

Analysis.cpp

Move llvm::canBeOmittedFromSymbolTable() to Analysis.

2017-03-31 04:46:31 +00:00

AntiDepBreaker.h

…

AtomicExpandPass.cpp

Rename AttributeSet to AttributeList

2017-03-21 16:57:19 +00:00

BasicTargetTransformInfo.cpp

…

BranchCoalescing.cpp

Strip trailing whitespace.

2017-03-10 22:53:19 +00:00

BranchFolding.cpp

NFC: Reformats comments according to the coding guildelines.

2017-03-15 06:29:23 +00:00

BranchFolding.h

NFC: Reformats comments according to the coding guildelines.

2017-03-15 06:29:23 +00:00

BranchRelaxation.cpp

Cleanup dump() functions.

2017-01-28 02:02:38 +00:00

BuiltinGCs.cpp

[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

2017-02-27 22:45:06 +00:00

CalcSpillWeights.cpp

CodeGen: Use MachineInstr& in TargetInstrInfo, NFC

2016-06-30 00:01:54 +00:00

CallingConvLower.cpp

[CodeGen] Remove dead call-or-prologue enum from CCState

2017-02-02 21:58:22 +00:00

CMakeLists.txt

[Outliner] Fixed Asan bot failure in r296418

2017-03-06 21:31:18 +00:00

CodeGen.cpp

CodeGen.cpp: Sort alphabetically; NFC

2017-03-18 05:05:32 +00:00

CodeGenPrepare.cpp

Turn on -addr-sink-using-gep by default.

2017-04-06 22:42:18 +00:00

CountingFunctionInserter.cpp

Revert "Turn some C-style vararg into variadic templates"

2017-04-06 20:23:57 +00:00

CriticalAntiDepBreaker.cpp

getPristineRegs is not accurately considering shrink wrapping puts

2017-03-30 22:34:20 +00:00

CriticalAntiDepBreaker.h

…

DeadMachineInstructionElim.cpp

Fix typos

2017-02-15 22:19:06 +00:00

DetectDeadLanes.cpp

Spelling mistakes in comments. NFCI.

2017-03-30 12:59:53 +00:00

DFAPacketizer.cpp

[Packetizer] Add debugging code to stop packetization after N instructions

2016-08-19 21:12:52 +00:00

DwarfEHPrepare.cpp

Use StringRef in Pass/PassManager APIs (NFC)

2016-10-01 02:56:57 +00:00

EarlyIfConversion.cpp

Use StringRef in Pass/PassManager APIs (NFC)

2016-10-01 02:56:57 +00:00

EdgeBundles.cpp

Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes.

2016-08-25 00:45:04 +00:00

ExecutionDepsFix.cpp

[ExecutionDepsFix] Don't recurse over the CFG

2017-04-05 17:42:56 +00:00

ExpandISelPseudos.cpp

CodeGen: Use MachineInstr& in ExpandISelPseudos, NFC

2016-06-30 23:09:39 +00:00

ExpandPostRAPseudos.cpp

ExpandPostRAPseudos should transfer implicit uses, not only implicit defs

2016-07-15 22:31:14 +00:00

FaultMaps.cpp

[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

2017-02-27 22:45:06 +00:00

FEntryInserter.cpp

[X86] Implement -mfentry

2017-01-31 17:00:27 +00:00

FuncletLayout.cpp

MachineFunctionProperties/MIRParser: Rename AllVRegsAllocated->NoVRegs, compute it

2016-08-25 01:27:13 +00:00

GCMetadata.cpp

Use StringRef in Pass/PassManager APIs (NFC)

2016-10-01 02:56:57 +00:00

GCMetadataPrinter.cpp

Reapply r276973 "Adjust Registry interface to not require plugins to export a registry"

2016-08-05 11:01:08 +00:00

GCRootLowering.cpp

Use StringRef in Pass/PassManager APIs (NFC)

2016-10-01 02:56:57 +00:00

GCStrategy.cpp

[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

2017-02-27 22:45:06 +00:00

GlobalMerge.cpp

Simplify code and address review comments (NFC)

2016-11-11 22:09:25 +00:00

IfConversion.cpp

[IfConversion] Only renormalize probabilities if branches are analyzable

2017-03-06 19:12:42 +00:00

ImplicitNullChecks.cpp

[ImplicitNullCheck] Add alias analysis usage

2017-02-28 07:04:49 +00:00

InlineSpiller.cpp

CodeGen : Check LLVM_ENABLE_DUMP definition for dumpMachineInstrRangeWithSlotIndex.

2017-03-28 04:14:25 +00:00

InterferenceCache.cpp

…

InterferenceCache.h

…

InterleavedAccessPass.cpp

InterleaveAccessPass: Avoid constructing invalid shuffle masks

2017-01-31 18:37:53 +00:00

IntrinsicLowering.cpp

Revert "Turn some C-style vararg into variadic templates"

2017-04-06 20:23:57 +00:00

LatencyPriorityQueue.cpp

Use the range variant of find instead of unpacking begin/end

2016-08-11 22:21:41 +00:00

LazyMachineBlockFrequencyInfo.cpp

[LazyMachineBFI] Reimplement with getAnalysisIfAvailable

2017-02-23 17:30:01 +00:00

LexicalScopes.cpp

[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

2017-02-17 21:43:25 +00:00

LiveDebugValues.cpp

LiveDebugValues: Assume calls never clobber SP.

2017-03-03 01:08:25 +00:00

LiveDebugVariables.cpp

Cleanup dump() functions.

2017-01-28 02:02:38 +00:00

LiveDebugVariables.h

…

LiveInterval.cpp

RegisterCoalescer: Simplify subrange splitting code; NFC

2017-03-03 19:05:34 +00:00

LiveIntervalAnalysis.cpp

Fix subreg value numbers in handleMoveUp

2017-03-11 00:14:52 +00:00

LiveIntervalUnion.cpp

LIU:::Query: Query LiveRange instead of LiveInterval; NFC

2017-03-01 21:48:12 +00:00

LivePhysRegs.cpp

Disable Callee Saved Registers

2017-03-14 09:09:26 +00:00

LiveRangeCalc.cpp

RegisterCoalescer: Simplify subrange splitting code; NFC

2017-03-03 19:05:34 +00:00

LiveRangeCalc.h

Extract LaneBitmask into a separate type

2016-12-15 14:36:06 +00:00

LiveRangeEdit.cpp

[LiveRangeEdit] Don't mess up with LiveInterval when a new vreg is created.

2017-02-02 20:44:36 +00:00

LiveRangeUtils.h

CodeGen: Refactor renameDisconnectedComponents() as a pass

2016-05-31 22:38:06 +00:00

LiveRegMatrix.cpp

LiveRegMatrix: Fix some subreg interference checks

2017-03-02 00:35:08 +00:00

LiveRegUnits.cpp

[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

2017-02-17 21:43:25 +00:00

LiveStackAnalysis.cpp

…

LiveVariables.cpp

Cleanup dump() functions.

2017-01-28 02:02:38 +00:00

LLVMBuild.txt

Prune unused libdeps.

2016-12-08 15:28:02 +00:00

LLVMTargetMachine.cpp

[GlobalISel] Add a way for targets to enable GISel.

2017-03-01 23:33:08 +00:00

LocalStackSlotAllocation.cpp

Fix nondeterministic output in local stack slot alloc pass

2016-10-26 14:53:50 +00:00

LowerEmuTLS.cpp

Re-commit optimization bisect support (r267022) without new pass manager support.

2016-04-22 22:06:11 +00:00

LowLevelType.cpp

Recommit: [globalisel] Change LLT constructor string into an LLT-based object that knows how to generate it.

2017-03-07 23:20:35 +00:00

MachineBasicBlock.cpp

Refactor code to create getFallThrough method in MachineBasicBlock.

2017-03-31 15:55:37 +00:00

MachineBlockFrequencyInfo.cpp

include function name in dot filename

2017-02-15 19:21:04 +00:00

MachineBlockPlacement.cpp

Fix trellis layout to avoid mis-identify triangle.

2017-03-23 23:28:09 +00:00

MachineBranchProbabilityInfo.cpp

Use the range variant of find/find_if instead of unpacking begin/end

2016-08-12 03:55:06 +00:00

MachineCombiner.cpp

Fix up grammar in a comment.

2017-03-15 21:50:46 +00:00

MachineCopyPropagation.cpp

MachineCopyPropagation: Respect implicit operands of COPY

2017-02-04 02:27:20 +00:00

MachineCSE.cpp

[codegen] Add generic functions to skip debug values.

2016-12-16 11:10:26 +00:00

MachineDominanceFrontier.cpp

…

MachineDominators.cpp

Do not verify MachimeDominatorTree if it is not calculated

2017-03-02 12:00:10 +00:00

MachineFunction.cpp

Disable Callee Saved Registers

2017-03-14 09:09:26 +00:00

MachineFunctionPass.cpp

Reverted: Track validity of pass results

2017-01-15 10:23:18 +00:00

MachineFunctionPrinterPass.cpp

Use StringRef in Pass/PassManager APIs (NFC)

2016-10-01 02:56:57 +00:00

MachineInstr.cpp

GlobalISel: allow quad-precision values to be dumped.

2017-03-20 16:52:08 +00:00

MachineInstrBundle.cpp

CodeGen/Passes: Pass MachineFunction as functor arg; NFC

2016-10-24 23:23:02 +00:00

MachineLICM.cpp

When instructions are hoisted out of loops by MachineLICM, remove their debug loc.

2016-12-02 00:37:57 +00:00

MachineLoopInfo.cpp

New OptimizationRemarkEmitter pass for MIR

2017-01-25 23:20:33 +00:00

MachineModuleInfo.cpp

Implement FreeMachineFunction::getPassName().

2017-03-07 20:59:08 +00:00

MachineModuleInfoImpls.cpp

[WebAssembly] Add support for using a wasm global for the stack pointer.

2017-02-24 23:46:05 +00:00

MachineOptimizationRemarkEmitter.cpp

[CodeGen] Teach opt remarks how to print MI instructions.

2017-02-23 21:05:33 +00:00

MachineOutliner.cpp

Revert "Turn some C-style vararg into variadic templates"

2017-04-06 20:23:57 +00:00

MachinePassRegistry.cpp

…

MachinePipeliner.cpp

Spelling mistakes in comments. NFCI.

2017-03-31 10:59:37 +00:00

MachinePostDominators.cpp

…

MachineRegionInfo.cpp

MachineRegionInfo: Fix pass initialization

2017-02-18 00:41:16 +00:00

MachineRegisterInfo.cpp

[MIR] Support Customed Register Mask and CSRs

2017-03-19 08:14:18 +00:00

MachineScheduler.cpp

Improve machine schedulers for in-order processors

2017-03-27 20:46:37 +00:00

MachineSink.cpp

MachineRegisterInfo: Remove unused arg from isConstantPhysReg(); NFC

2016-10-28 18:05:09 +00:00

MachineSSAUpdater.cpp

Retire llvm::alignOf in favor of C++11 alignof.

2016-10-20 15:02:18 +00:00

MachineTraceMetrics.cpp

[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

2017-02-21 22:07:52 +00:00

MachineVerifier.cpp

[MachineVerifier] Drop a spurious const

2017-03-29 15:25:06 +00:00

MIRPrinter.cpp

[MIR] Support Customed Register Mask and CSRs

2017-03-19 08:14:18 +00:00

MIRPrinter.h

…

MIRPrintingPass.cpp

Use StringRef in Pass/PassManager APIs (NFC)

2016-10-01 02:56:57 +00:00

OptimizePHIs.cpp

CodeGen: Avoid dereferencing end() in OptimizePHIs::OptimizeBB

2016-08-17 00:43:59 +00:00

ParallelCG.cpp

Bitcode: Change module reader functions to return an llvm::Expected.

2016-11-13 07:00:17 +00:00

PatchableFunction.cpp

[CodeGen] Rename MachineInstrBuilder::addOperand. NFC

2017-01-13 09:58:52 +00:00

PeepholeOptimizer.cpp

PeepholeOptimizer: Do not replace SubregToReg(bitcast like)

2017-01-09 21:38:17 +00:00

PHIElimination.cpp

MachineFunction: Introduce NoPHIs property

2016-08-23 21:19:49 +00:00

PHIEliminationUtils.cpp

Place the lowered phi instruction(s) before the DEBUG_VALUE entry

2016-09-16 14:07:29 +00:00

PHIEliminationUtils.h

…

PostRAHazardRecognizer.cpp

CodeGen: Use MachineInstr& in PostRAHazardRecognizer, NFC

2016-07-01 00:50:29 +00:00

PostRASchedulerList.cpp

Cleanup dump() functions.

2017-01-28 02:02:38 +00:00

PreISelIntrinsicLowering.cpp

[PM] Port PreISelIntrinsicLowering to the new PM

2016-06-24 20:13:42 +00:00

ProcessImplicitDefs.cpp

…

PrologEpilogInserter.cpp

Disable Callee Saved Registers

2017-03-14 09:09:26 +00:00

PseudoSourceValue.cpp

Fix crashing on TargetCustom PseudoSourceValues

2017-03-28 20:33:12 +00:00

README.txt

…

RegAllocBase.cpp

Timer: Track name and description.

2016-11-18 19:43:18 +00:00

RegAllocBase.h

Timer: Track name and description.

2016-11-18 19:43:18 +00:00

RegAllocBasic.cpp

LIU::Query: Remove always false member+getter; NFC

2017-03-01 21:02:52 +00:00

RegAllocFast.cpp

Use StringRef in Pass/PassManager APIs (NFC)

2016-10-01 02:56:57 +00:00

RegAllocGreedy.cpp

RegAllocGreedy: Follow-up to r296722

2017-03-03 23:27:20 +00:00

RegAllocPBQP.cpp

Disable Callee Saved Registers

2017-03-14 09:09:26 +00:00

RegisterClassInfo.cpp

Disable Callee Saved Registers

2017-03-14 09:09:26 +00:00

RegisterCoalescer.cpp

RegisterCoalescer: Simplify subrange splitting code; NFC

2017-03-03 19:05:34 +00:00

RegisterCoalescer.h

…

RegisterPressure.cpp

Revert "Correct register pressure calculation in presence of subregs"

2017-02-24 21:56:16 +00:00

RegisterScavenging.cpp

[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

2017-02-21 22:07:52 +00:00

RegisterUsageInfo.cpp

Move helpers into anonymous namespaces. NFC.

2016-08-06 11:13:10 +00:00

RegUsageInfoCollector.cpp

[IPRA] Change algorithm for RegUsageInfoCollector.

2017-03-13 21:42:53 +00:00

RegUsageInfoPropagate.cpp

Use StringRef in Pass/PassManager APIs (NFC)

2016-10-01 02:56:57 +00:00

RenameIndependentSubregs.cpp

Extract LaneBitmask into a separate type

2016-12-15 14:36:06 +00:00

ResetMachineFunctionPass.cpp

GlobalISel: Abort in ResetMachineFunctionPass if fallback isn't enabled

2017-01-13 23:46:11 +00:00

SafeStack.cpp

Revert "Turn some C-style vararg into variadic templates"

2017-04-06 20:23:57 +00:00

SafeStackColoring.cpp

Cleanup dump() functions.

2017-01-28 02:02:38 +00:00

SafeStackColoring.h

StackColoring for SafeStack.

2016-06-29 20:37:43 +00:00

SafeStackLayout.cpp

[safestack] Layout large allocas first to reduce fragmentation.

2016-08-02 23:21:30 +00:00

SafeStackLayout.h

StackColoring for SafeStack.

2016-06-29 20:37:43 +00:00

ScheduleDAG.cpp

MachineScheduler/ScheduleDAG: Add support for GetSubGraph

2017-03-28 05:12:31 +00:00

ScheduleDAGInstrs.cpp

Refactor alias check from MISched into common helper. NFC.

2017-03-09 23:33:36 +00:00

ScheduleDAGPrinter.cpp

…

ScoreboardHazardRecognizer.cpp

[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

2017-02-22 22:32:51 +00:00

ShadowStackGCLowering.cpp

[tsan] Add support for C++ exceptions into TSan (call __tsan_func_exit during unwinding), LLVM part

2016-11-14 21:41:13 +00:00

ShrinkWrap.cpp

Use StringRef in Pass/PassManager APIs (NFC)

2016-10-01 02:56:57 +00:00

SjLjEHPrepare.cpp

Revert "Turn some C-style vararg into variadic templates"

2017-04-06 20:23:57 +00:00

SlotIndexes.cpp

VirtRegMap: Correctly deal with bundles when deleting identity copies.

2017-03-17 00:41:33 +00:00

Spiller.h

…

SpillPlacement.cpp

Reapply r263460: [SpillPlacement] Fix a quadratic behavior in spill placement.

2016-05-19 22:40:37 +00:00

SpillPlacement.h

Reapply r263460: [SpillPlacement] Fix a quadratic behavior in spill placement.

2016-05-19 22:40:37 +00:00

SplitKit.cpp

SplitKit: Fix subreg copy related problems

2017-03-21 21:58:08 +00:00

SplitKit.h

SplitKit: Correctly implement partial subregister copies

2017-03-17 00:41:39 +00:00

StackColoring.cpp

[StackColoring] Remove unused header file for post-order traversal. Update comment that indicated we were using it when we really use a depth-first search. NFC

2017-03-15 22:40:26 +00:00

StackMapLivenessAnalysis.cpp

LivePhysReg: Use reference instead of pointer in init(); NFC

2016-12-08 00:15:51 +00:00

StackMaps.cpp

[CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).

2017-02-22 22:32:51 +00:00

StackProtector.cpp

Revert "Turn some C-style vararg into variadic templates"

2017-04-06 20:23:57 +00:00

StackSlotColoring.cpp

In the below scenario, we must be able to skip the a DBG_VALUE instruction and

2017-01-09 17:45:02 +00:00

TailDuplication.cpp

Codegen: Tail-duplicate during placement.

2016-10-11 20:36:43 +00:00

TailDuplicator.cpp

[TailDuplicator] Maintain DebugLoc for branch instructions

2017-02-27 19:30:01 +00:00

TargetFrameLoweringImpl.cpp

Disable Callee Saved Registers

2017-03-14 09:09:26 +00:00

TargetInstrInfo.cpp

[CodeGen] Rename MachineInstrBuilder::addOperand. NFC

2017-01-13 09:58:52 +00:00

TargetLoweringBase.cpp

[SelectionDAG] Enable target specific vector scalarization of calls and returns

2017-04-07 13:03:52 +00:00

TargetLoweringObjectFileImpl.cpp

Move llvm::emitLinkerFlagsForGlobalCOFF() to Mangler.

2017-03-31 04:46:50 +00:00

TargetOptionsImpl.cpp

Remove LessPreciseFPMADOption from TargetOptions along with all of the

2017-03-17 00:38:03 +00:00

TargetPassConfig.cpp

Allow targets to opt-in to codegen in SCC order

2017-04-04 23:44:46 +00:00

TargetRegisterInfo.cpp

Revert "Correct register pressure calculation in presence of subregs"

2017-02-24 21:56:16 +00:00

TargetSchedule.cpp

Improve machine schedulers for in-order processors

2017-03-27 20:46:37 +00:00

TargetSubtargetInfo.cpp

TargetSubtargetInfo: Move implementation to lib/CodeGen; NFC

2016-11-22 22:09:03 +00:00

TwoAddressInstructionPass.cpp

[TwoAddressInstruction] Fix typo in comment. NFC

2017-02-04 01:58:10 +00:00

UnreachableBlockElim.cpp

Modify df_iterator to support post-order actions

2016-10-05 21:36:16 +00:00

VirtRegMap.cpp

SplitKit: Fix subreg copy related problems

2017-03-21 21:58:08 +00:00

WinEHPrepare.cpp

[WinEH] Avoid holding references to BlockColor (DenseMap) entries while inserting new elements

2016-12-14 19:30:18 +00:00

XRayInstrumentation.cpp

[LLVM][XRAY][MIPS] Support xray on mips/mipsel/mips64/mips64el

2017-02-15 10:48:11 +00:00

README.txt

//===---------------------------------------------------------------------===//

Common register allocation / spilling problem:

        mul lr, r4, lr
        str lr, [sp, #+52]
        ldr lr, [r1, #+32]
        sxth r3, r3
        ldr r4, [sp, #+52]
        mla r4, r3, lr, r4

can be:

        mul lr, r4, lr
        mov r4, lr
        str lr, [sp, #+52]
        ldr lr, [r1, #+32]
        sxth r3, r3
        mla r4, r3, lr, r4

and then "merge" mul and mov:

        mul r4, r4, lr
        str r4, [sp, #+52]
        ldr lr, [r1, #+32]
        sxth r3, r3
        mla r4, r3, lr, r4

It also increases the likelihood that the store becomes dead.

//===---------------------------------------------------------------------===//

bb27 ...
        ...
        %reg1037 = ADDri %reg1039, 1
        %reg1038 = ADDrs %reg1032, %reg1039, %NOREG, 10
    Successors according to CFG: 0x8b03bf0 (#5)

bb76 (0x8b03bf0, LLVM BB @0x8b032d0, ID#5):
    Predecessors according to CFG: 0x8b0c5f0 (#3) 0x8b0a7c0 (#4)
        %reg1039 = PHI %reg1070, mbb<bb76.outer,0x8b0c5f0>, %reg1037, mbb<bb27,0x8b0a7c0>

Note ADDri is not a two-address instruction. However, its result %reg1037 is an
operand of the PHI node in bb76 and its operand %reg1039 is the result of the
PHI node. We should treat it as a two-address code and make sure the ADDri is
scheduled after any node that reads %reg1039.

//===---------------------------------------------------------------------===//

Use local info (i.e. register scavenger) to assign it a free register to allow
reuse:
        ldr r3, [sp, #+4]
        add r3, r3, #3
        ldr r2, [sp, #+8]
        add r2, r2, #2
        ldr r1, [sp, #+4]  <==
        add r1, r1, #1
        ldr r0, [sp, #+4]
        add r0, r0, #2

//===---------------------------------------------------------------------===//

LLVM aggressively lifts common subexpressions out of loops. Sometimes this can
have negative side effects:

R1 = X + 4
R2 = X + 7
R3 = X + 15

loop:
load [i + R1]
...
load [i + R2]
...
load [i + R3]

Suppose register pressure is high, so R1, R2, and R3 can be spilled. We need
to implement proper re-materialization to handle this:

R1 = X + 4
R2 = X + 7
R3 = X + 15

loop:
R1 = X + 4  @ re-materialized
load [i + R1]
...
R2 = X + 7 @ re-materialized
load [i + R2]
...
R3 = X + 15 @ re-materialized
load [i + R3]

Furthermore, with re-association, we can enable sharing:

R1 = X + 4
R2 = X + 7
R3 = X + 15

loop:
T = i + X
load [T + 4]
...
load [T + 7]
...
load [T + 15]
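
The re-association above relies on i + (X + C) and (i + X) + C naming the same
address. A minimal sketch of that equivalence, assuming plain wrapping integer
arithmetic; the function names are hypothetical and are not LLVM APIs:

```cpp
#include <cassert>
#include <cstdint>

// Hoisted form: each load address is i + Rk, where Rk = X + Ck is a
// loop-invariant value that occupies its own register across the loop.
std::uint64_t hoistedAddress(std::uint64_t i, std::uint64_t x,
                             std::uint64_t c) {
  std::uint64_t r = x + c; // R1, R2 or R3 in the example above
  return i + r;
}

// Re-associated form: T = i + X is computed once per iteration and the
// constant is folded into the load as an immediate offset.
std::uint64_t reassociatedAddress(std::uint64_t t, std::uint64_t c) {
  return t + c; // load [T + c]
}
```

In the hoisted form three values (R1, R2, R3) stay live across the loop; in the
re-associated form only T is needed inside the loop body.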

//===---------------------------------------------------------------------===//

It's not always a good idea to choose rematerialization over spilling. If all
the load / store instructions would be folded then spilling is cheaper because
it won't require new live intervals / registers. See 2003-05-31-LongShifts for
an example.

//===---------------------------------------------------------------------===//

With a copying garbage collector, derived pointers must not be retained across
collector safe points; the collector could move the objects and invalidate the
derived pointer. This is bad enough in the first place, but safe points can
crop up unpredictably. Consider:

        %array = load { i32, [0 x %obj] }** %array_addr
        %nth_el = getelementptr { i32, [0 x %obj] }* %array, i32 0, i32 %n
        %old = load %obj** %nth_el
        %z = div i64 %x, %y
        store %obj* %new, %obj** %nth_el

If the i64 division is lowered to a libcall, then a safe point will (must)
appear for the call site. If a collection occurs, %array and %nth_el no longer
point into the correct object.

The fix for this is to copy address calculations so that dependent pointers
are never live across safe point boundaries. But the loads cannot be copied
like this if there was an intervening store, so this may be hard to get right.

Only a concurrent mutator can trigger a collection at the libcall safe point.
So single-threaded programs do not have this requirement, even with a copying
collector. Still, LLVM optimizations would probably undo a front-end's careful
work.

//===---------------------------------------------------------------------===//

The OCaml frametable structure supports liveness information. It would be good
to support it.

//===---------------------------------------------------------------------===//

The FIXME in ComputeCommonTailLength in BranchFolding.cpp needs to be
revisited. The check is there to work around a misuse of directives in inline
assembly.

//===---------------------------------------------------------------------===//

It would be good to detect collector/target compatibility instead of silently
doing the wrong thing.

//===---------------------------------------------------------------------===//

It would be really nice to be able to write patterns in .td files for copies,
which would eliminate a bunch of explicit predicates on them (e.g. no side 
effects).  Once this is in place, it would be even better to have tblgen 
synthesize the various copy insertion/inspection methods in TargetInstrInfo.

//===---------------------------------------------------------------------===//

Stack coloring improvements:

1. Do proper LiveStackAnalysis on all stack objects including those which are
   not spill slots.
2. Reorder objects to fill in gaps between objects.
   e.g. 4, 1, <gap>, 4, 1, 1, 1, <gap>, 4 => 4, 1, 1, 1, 1, 4, 4
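
The gap-filling idea in item 2 can be sketched as follows. This is only an
illustration, not the stack coloring pass: it assumes every object is aligned
to its own size and that sizes are powers of two, and both function names are
hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Total frame bytes needed when objects are placed in the given order.
// Assumption for this sketch: each object is aligned to its own (non-zero)
// size, so padding is inserted whenever the running offset is misaligned.
unsigned frameSize(const std::vector<unsigned> &sizes) {
  unsigned offset = 0;
  for (unsigned size : sizes) {
    offset = (offset + size - 1) / size * size; // round up to the alignment
    offset += size;
  }
  return offset;
}

// One simple reordering heuristic: place the largest (most strictly aligned)
// objects first. With power-of-two sizes this leaves no gaps.
std::vector<unsigned> reorderBySize(std::vector<unsigned> sizes) {
  std::stable_sort(sizes.begin(), sizes.end(), std::greater<unsigned>());
  return sizes;
}
```

For the example above, the original order 4, 1, 4, 1, 1, 1, 4 needs 20 bytes
under these assumptions, while 4, 1, 1, 1, 1, 4, 4 needs 16. Sorting by size
produces a different order (4, 4, 4, 1, 1, 1, 1) that also needs 16.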

//===---------------------------------------------------------------------===//

The scheduler should be able to sort nearby instructions by their address. For
example, in an expanded memset sequence it's not uncommon to see code like this:

  movl $0, 4(%rdi)
  movl $0, 8(%rdi)
  movl $0, 12(%rdi)
  movl $0, 0(%rdi)

Each of the stores is independent, and the scheduler is currently making an
arbitrary decision about the order.
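
As a rough sketch of the desired ordering, independent stores off the same base
register could be sorted by displacement. The Store type and sortByAddress are
hypothetical stand-ins for illustration and do not correspond to scheduler
data structures:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, simplified model of "movl $value, offset(%base)".
struct Store {
  int offset; // displacement from the common base register
  int value;  // immediate being stored
};

// Order mutually independent stores by ascending offset. std::stable_sort
// keeps the incoming order for stores with equal offsets.
std::vector<Store> sortByAddress(std::vector<Store> stores) {
  std::stable_sort(
      stores.begin(), stores.end(),
      [](const Store &a, const Store &b) { return a.offset < b.offset; });
  return stores;
}
```

Applied to the sequence above (offsets 4, 8, 12, 0), this yields 0, 4, 8, 12.
The sketch assumes the stores are already known to be independent; it does no
dependence checking of its own.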

//===---------------------------------------------------------------------===//

Another opportunity in this code is that the $0 could be moved to a register:

  movl $0, 4(%rdi)
  movl $0, 8(%rdi)
  movl $0, 12(%rdi)
  movl $0, 0(%rdi)

This would save substantial code size, especially for longer sequences like
this. It would be easy to have a rule telling isel to avoid matching MOV32mi
if the immediate has more than some fixed number of uses. It's more involved
to teach the register allocator how to do late folding to recover from
excessive register pressure.
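
A back-of-the-envelope sketch of the code-size argument and of a use-count
rule. The byte counts are assumptions for x86-64 with an 8-bit displacement
and registers that need no REX prefix; the threshold parameter and all
function names are hypothetical and are not part of isel:

```cpp
#include <cassert>

// Assumed encoding sizes in bytes (opcode + ModRM + disp8 [+ imm32]).
constexpr unsigned StoreImmBytes = 7; // movl $imm32, disp8(%reg)
constexpr unsigned StoreRegBytes = 3; // movl %reg, disp8(%reg)
constexpr unsigned ZeroRegBytes = 2;  // xorl %reg, %reg

// Size of N stores that each carry the immediate.
unsigned bytesWithImmediates(unsigned numStores) {
  return numStores * StoreImmBytes;
}

// Size of N stores from a register that is zeroed once up front.
unsigned bytesWithRegister(unsigned numStores) {
  return ZeroRegBytes + numStores * StoreRegBytes;
}

// Hypothetical rule: keep folding the immediate into the store only while
// it has at most maxUses uses.
bool shouldFoldImmediate(unsigned numUses, unsigned maxUses) {
  return numUses <= maxUses;
}
```

Under these assumptions four stores take about 28 bytes with immediates and
about 14 bytes with a zeroed register. The store to 0(%rdi) can be encoded
without a displacement byte, so the real totals are one byte smaller in each
case. The register form ties up a register, which is why recovering under
register pressure is the harder part.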