uses.
Also splitting the buildSources part allows more overloads such as
adding MachineOperands directly in the arguments for buildInstr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309163 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: We can use the template parameter `IsPostDom` to pick an appropriate SmallVector size to store DomTree roots for dominators and postdominators. Before, the code would always allocate memory with `std::vector`.
Reviewers: dberlin, davide, sanjoy, grosser
Reviewed By: grosser
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35636
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309148 91177308-0d34-0410-b5e6-96231b3b80d8
The ARM bots have started failing and while this patch should be an improvement
for these bots, it's also the only suspect in the blamelist. Reverting while
Diana and I investigate the problem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309111 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Now that we have control flow in place, fuse the per-rule tables into a
single table. This is a compile-time saving at this point. However, this will
also enable the optimization of a table so that similar instructions can be
tested together, reducing the time spent on the matching the code.
This is NFC in terms of externally visible behaviour but some internals have
changed slightly. State.MIs is no longer reset between each rule that is
attempted because it's not necessary to do so. As a consequence of this the
restriction on the order that instructions are added to State.MIs has been
relaxed to only affect recorded instructions that require new elements to be
added to the vector. GIM_RecordInsn can now write to any element from 1 to
State.MIs.size() instead of just State.MIs.size().
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: rovka
Subscribers: kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D35681
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309094 91177308-0d34-0410-b5e6-96231b3b80d8
This patch moves the DAGCombiner::GetDemandedBits function to SelectionDAG::GetDemandedBits as a first step towards making it easier for targets to get to the source of any demanded bits without the limitations of SimplifyDemandedBits.
Differential Revision: https://reviews.llvm.org/D35841
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308983 91177308-0d34-0410-b5e6-96231b3b80d8
This patch makes LSR generate better code for SystemZ in the cases of memory
intrinsics, Load->Store pairs or comparison of immediate with memory.
In order to achieve this, the following common code changes were made:
* New TTI hook: LSRWithInstrQueries(), which defaults to false. Controls if
LSR should do instruction-based addressing evaluations by calling
isLegalAddressingMode() with the Instruction pointers.
* In LoopStrengthReduce: handle address operands of memset, memmove and memcpy
as address uses, and call isFoldableMemAccessOffset() for any LSRUse::Address,
not just loads or stores.
SystemZ changes:
* isLSRCostLess() implemented with Insns first, and without ImmCost.
* New function supportedAddressingMode() that is a helper for TTI methods
looking at Instructions passed via pointers.
Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D35262https://reviews.llvm.org/D35049
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308729 91177308-0d34-0410-b5e6-96231b3b80d8
On AMDGPU SGPR spills are really spilled to another register.
The spiller creates the spills to new frame index objects,
which is used as a placeholder.
This will eventually be replaced with a reference to a position
in a VGPR to write to and the frame index deleted. It is
most likely not a real stack location that can be shared
with another stack object.
This is a problem when StackSlotColoring decides it should
combine a frame index used for a normal VGPR spill with
a real stack location and a frame index used for an SGPR.
Add an ID field so that StackSlotColoring has a way
of knowing the different frame index types are
incompatible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308673 91177308-0d34-0410-b5e6-96231b3b80d8
It seems that G++ 4.8 doesn't accept the 'enum A' in code of the form:
enum A { ... };
const auto &F = []() -> enum A { ... };
However, it does accept:
typedef enum { ... } A;
const auto &F = []() -> A { ... };
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308599 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This will allow us to merge the various sub-tables into a single table. This is a
compile-time saving at this point. However, this will also enable the optimization
of a table so that similar instructions can be tested together, reducing the time
spent on the matching the code.
The bulk of this patch is a mechanical conversion to the new MatchTable object
which is responsible for tracking label definitions and filling in the index of
the jump targets. It is also responsible for nicely formatting the table.
This was necessary to support the new GIM_Try opcode which takes the index to
jump to if the match should fail. This value is unknown during table
construction and is filled in during emission. To support nesting try-blocks
(although we currently don't emit tables with nested try-blocks), GIM_Reject
has been re-introduced to explicitly exit a try-block or fail the overall match
if there are no active try-blocks.
Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar
Reviewed By: rovka
Subscribers: kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D35117
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308596 91177308-0d34-0410-b5e6-96231b3b80d8
Now, getUserCost() only checks the src and dst types of EXT to decide it is free
or not. This change first checks the types, then calls isExtFreeImpl(), and
check if EXT can form ExtLoad at last. Currently, only AArch64 has customized
implementation of isExtFreeImpl() to check if EXT can be folded into its use.
Differential Revision: https://reviews.llvm.org/D34458
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308076 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
DominatorTreeBase used to have IsPostDominators (bool) member to indicate if the tree is a dominator or a postdominator tree. This made it possible to switch between the two 'modes' at runtime, but it isn't used in practice anywhere.
This patch makes IsPostDominator a template argument. This way, it is easier to switch between different algorithms at compile-time based on this argument and design external utilities around it. It also makes it impossible to incidentally assign a postdominator tree to a dominator tree (and vice versa), and to further simplify template code in GenericDominatorTreeConstruction.
Reviewers: dberlin, sanjoy, davide, grosser
Reviewed By: dberlin
Subscribers: mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D35315
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308040 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: Add target hooks for printing and parsing target MMO flags.
Targets may override getSerializableMachineMemOperandTargetFlags() to
return a mapping from string to flag value for target MMO values that
should be serialized/parsed in MIR output.
Add implementation of this hook for AArch64 SuppressPair MMO flag.
Reviewers: bogner, hfinkel, qcolombet, MatzeB
Subscribers: mcrosier, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D34962
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307877 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: Continuing the work from https://reviews.llvm.org/D33240, this change introduces an element unordered-atomic memset intrinsic. This intrinsic is essentially memset with the implementation requirement that all stores used for the assignment are done with unordered-atomic stores of a given element size.
Reviewers: eli.friedman, reames, mkazantsev, skatkov
Reviewed By: reames
Subscribers: jfb, dschuff, sbc100, jgravelle-google, aheejin, efriedma, llvm-commits
Differential Revision: https://reviews.llvm.org/D34885
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307854 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: Continuing the work from https://reviews.llvm.org/D33240, this change introduces an element unordered-atomic memmove intrinsic. This intrinsic is essentially memmove with the implementation requirement that all loads/stores used for the copy are done with unordered-atomic loads/stores of a given element size.
Reviewers: eli.friedman, reames, mkazantsev, skatkov
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34884
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307796 91177308-0d34-0410-b5e6-96231b3b80d8
OpenCL 2.0 introduces the notion of memory scopes in atomic operations to
global and local memory. These scopes restrict how synchronization is
achieved, which can result in improved performance.
This change extends existing notion of synchronization scopes in LLVM to
support arbitrary scopes expressed as target-specific strings, in addition to
the already defined scopes (single thread, system).
The LLVM IR and MIR syntax for expressing synchronization scopes has changed
to use *syncscope("<scope>")*, where <scope> can be "singlethread" (this
replaces *singlethread* keyword), or a target-specific name. As before, if
the scope is not specified, it defaults to CrossThread/System scope.
Implementation details:
- Mapping from synchronization scope name/string to synchronization scope id
is stored in LLVM context;
- CrossThread/System and SingleThread scopes are pre-defined to efficiently
check for known scopes without comparing strings;
- Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in
the bitcode.
Differential Revision: https://reviews.llvm.org/D21723
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307722 91177308-0d34-0410-b5e6-96231b3b80d8
TreePatternNode considers them to be plain integers but MachineInstr considers
them to be a distinct kind of operand.
The tweak to AArch64InstrInfo.td to produce a simple test case is a NFC for
everything except GlobalISelEmitter (confirmed by diffing the tablegenerated
files). GlobalISelEmitter is currently unable to infer the type of operands in
the Dst pattern from the operands in the Src pattern.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307634 91177308-0d34-0410-b5e6-96231b3b80d8
Contrary to the stepForward()/stepBackward() method accumulate() doesn't
have a direction as defs, uses and clobbers all have the same effect.
Also improve the documentation comment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307351 91177308-0d34-0410-b5e6-96231b3b80d8
Allows the MachineIRBuilder APIs to directly create registers (based on
LLT or TargetRegisterClass) as well as accept MachineInstrBuilders
and implicitly converts to register(with getOperand(0).getReg()).
Eg usage:
LLT s32 = LLT::scalar(32);
auto C32 = Builder.buildConstant(s32, 32);
auto Tmp = Builder.buildInstr(TargetOpcode::G_SUB, s32, C32,
OtherReg);
auto Tmp2 = Builder.buildInstr(Opcode, DstReg,
Builder.buildConstant(s32, 31)); ....
Only a few methods added for now.
Reviewed by Tim
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307302 91177308-0d34-0410-b5e6-96231b3b80d8
Going through the Constant methods requires redetermining that the Constant is a ConstantInt and then calling isZero/isOne/isMinusOne.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307292 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Also, made a few minor tweaks to shave off a little more cumulative memory consumption:
* All rules share a single NewMIs instead of constructing their own. Only one
will end up using it.
* Use MIs.resize(1) instead of MIs.clear();MIs.push_back(I) and prevent
GIM_RecordInsn from changing MIs[0].
Depends on D33764
Reviewers: rovka, vitalybuka, ab, t.p.northover, qcolombet, aditya_nandakumar
Reviewed By: ab
Subscribers: kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D33766
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307159 91177308-0d34-0410-b5e6-96231b3b80d8
We used to have a helper that replaced an instruction with a libcall.
That turns out to be too aggressive, since sometimes we need to replace
the instruction with at least two libcalls. Therefore, change our
existing helper to only create the libcall and leave the instruction
removal as a separate step. Also rename the helper accordingly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307149 91177308-0d34-0410-b5e6-96231b3b80d8
Add a helper for building simple binary ops like add, mul, sub, and.
This can be used in the future for quickly adding support for or, xor.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307139 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This further improves the compile-time regressions that will be caused by a
re-commit of r303259.
Also added included preliminary work in preparation for the multi-insn emitter
since I needed to change the relevant part of the API for this patch anyway.
Depends on D33758
Reviewers: rovka, vitalybuka, ab, t.p.northover, qcolombet, aditya_nandakumar
Reviewed By: ab
Subscribers: kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D33764
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307133 91177308-0d34-0410-b5e6-96231b3b80d8
Using NumPatternEmitted as a unique id for the tables is not valid on release
builds since the counters don't count in that case.
Also fix an unused variable warning.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307088 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Replace the matcher if-statements for each rule with a state-machine. This
significantly reduces compile time, memory allocations, and cumulative memory
allocation when compiling AArch64InstructionSelector.cpp.o after r303259 is
recommitted.
The following patches will expand on this further to fully fix the regressions.
Reviewers: rovka, ab, t.p.northover, qcolombet, aditya_nandakumar
Reviewed By: ab
Subscribers: vitalybuka, aemerson, javed.absar, igorb, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D33758
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307079 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
To enable profile hotness information in diagnostics output, Clang takes
the option `-fdiagnostics-show-hotness` -- that's "diagnostics", with an
"s" at the end. Clang also defines `CodeGenOptions::DiagnosticsWithHotness`.
LLVM, on the other hand, defines
`LLVMContext::getDiagnosticHotnessRequested` -- that's "diagnostic", not
"diagnostics". It's a small difference, but it's confusing, typo-inducing, and
frustrating.
Add a new method with the spelling "diagnostics", and "deprecate" the
old spelling.
Reviewers: anemet, davidxl
Reviewed By: anemet
Subscribers: llvm-commits, mehdi_amini
Differential Revision: https://reviews.llvm.org/D34864
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306848 91177308-0d34-0410-b5e6-96231b3b80d8
In r301116, a custom lowering needed to be introduced to be able to
legalize 8 and 16-bit divisions on ARM targets without a division
instruction, since 2-step legalization (WidenScalar from 8 bit to 32
bit, then Libcall the 32-bit division) doesn't work.
This fixes this and makes this kind of multi-step legalization, where
first the size of the type needs to be changed and then some action is
needed that doesn't require changing the size of the type,
straighforward to specify.
Differential Revision: https://reviews.llvm.org/D32529
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306806 91177308-0d34-0410-b5e6-96231b3b80d8
The style guide states that the explicit `inline`
should not be used with inline methods. classof is
very common inline method with a fair amount on
inconsistency:
$ git grep classof ./include | grep inline | wc -l
230
$ git grep classof ./include | grep -v inline | wc -l
257
I chose to target this method rather the larger change
since this method is easily cargo-culted (I did it at
least once). I considered doing the larger change and
removing all occurrences but that would be a much larger
change.
Differential Revision: https://reviews.llvm.org/D33906
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306731 91177308-0d34-0410-b5e6-96231b3b80d8
Relanding after restricting equalBaseIndex to not erroneuosly consider
a FrameIndices stemming from alloca from being comparable as its
offset is set post-selectionDAG.
Pull FrameIndex comparision reasoning from DAGCombiner::isAlias to
general BaseIndexOffset.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306688 91177308-0d34-0410-b5e6-96231b3b80d8
CFI instructions that set appropriate cfa offset and cfa register are now
inserted in emitEpilogue() in X86FrameLowering.
Majority of the changes in this patch:
1. Ensure that CFI instructions do not affect code generation.
2. Enable maintaining correct information about cfa offset and cfa register
in a function when basic blocks are reordered, merged, split, duplicated.
These changes are target independent and described below.
Changed CFI instructions so that they:
1. are duplicable
2. are not counted as instructions when tail duplicating or tail merging
3. can be compared as equal
Add information to each MachineBasicBlock about cfa offset and cfa register
that are valid at its entry and exit (incoming and outgoing CFI info). Add
support for updating this information when basic blocks are merged, split,
duplicated, created. Add a verification pass (CFIInfoVerifier) that checks
that outgoing cfa offset and register of predecessor blocks match incoming
values of their successors.
Incoming and outgoing CFI information is used by a late pass
(CFIInstrInserter) that corrects CFA calculation rule for a basic block if
needed. That means that additional CFI instructions get inserted at basic
block beginning to correct the rule for calculating CFA. Having CFI
instructions in function epilogue can cause incorrect CFA calculation rule
for some basic blocks. This can happen if, due to basic block reordering,
or the existence of multiple epilogue blocks, some of the blocks have wrong
cfa offset and register values set by the epilogue block above them.
Patch by Violeta Vukobrat.
Differential Revision: https://reviews.llvm.org/D18046
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306529 91177308-0d34-0410-b5e6-96231b3b80d8