In preparation for doing storemerge post-legalization, reorder
visitSTORE passes to move pre/post-index combining after store
merge. Reordered passes other than store merge are unaffected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305473 91177308-0d34-0410-b5e6-96231b3b80d8
As all store merges checks are based on the memory operation
performed, allow use of truncated stores and extended loads as valid
input candidates for merging.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305468 91177308-0d34-0410-b5e6-96231b3b80d8
Avoid non-legal memory ops by checking correct size when merging
stores of loads into a extload-truncstore pair.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305466 91177308-0d34-0410-b5e6-96231b3b80d8
Add support for modulo for targets that have hardware division and for
those that don't. When hardware division is not available, we have to
choose the correct libcall to use. This is generally straightforward,
except for AEABI.
The AEABI variant is trickier than the other libcalls because it
returns { quotient, remainder }, instead of just one value like the
other libcalls that we've seen so far. Therefore, we need to use custom
lowering for it. However, we don't want to have too much special code,
so we refactor the target-independent code in the legalizer by adding a
helper for replacing an instruction with a libcall. This helper is used
by the legalizer itself when dealing with simple calls, and also by the
custom ARM legalization for the more complicated AEABI divmod calls.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305459 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
At present, `-profile-guided-section-prefix` is a `cl::Optional` option, which means it demands to be passed exactly zero or one times. Our build system makes it pretty tricky to guarantee this. We often accidentally pass the flag more than once (but always with the same "false" value) which results in an error, after which compilation fails:
```
clang (LLVM option parsing): for the -profile-guided-section-prefix option: may only occur zero or one times!
```
While we work on improving our build system, it also seems reasonable just to allow `-profile-guided-section-prefix` to be passed more than once, by to `cl::ZeroOrMore`. Quoting [[ http://llvm.org/docs/CommandLine.html#controlling-the-number-of-occurrences-required-and-allowed | the documentation ]]:
> The cl::ZeroOrMore modifier ... indicates that your program will allow the option to be specified zero or more times.
> ...
> If an option is specified multiple times for an option of the cl::opt class, only the last value will be retained.
Reviewers: danielcdh
Reviewed By: danielcdh
Subscribers: twoh, david2050, llvm-commits
Differential Revision: https://reviews.llvm.org/D34219
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305413 91177308-0d34-0410-b5e6-96231b3b80d8
For multiprecision arithmetic on MIPS, rather than using ISD::ADDE / ISD::ADDC,
get SelectionDAG to break down the operation into ISD::ADDs and ISD::SETCCs.
For MIPS, only the DSP ASE has a carry flag, so in the general case it is not
useful to directly support ISD::{ADDE, ADDC, SUBE, SUBC} nodes.
Also improve the generation code in such cases for targets with
TargetLoweringBase::ZeroOrOneBooleanContent by directly using the result of the
comparison node rather than using it in selects. Similarly for ISD::SUBE /
ISD::SUBC.
Address optimization breakage by moving the generation of MIPS specific integer
multiply-accumulate nodes to before legalization.
This revolves PR32713 and PR33424.
Thanks to Simonas Kazlauskas and Pirama Arumuga Nainar for reporting the issue!
Reviewers: slthakur
Differential Revision: https://reviews.llvm.org/D33494
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305389 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch is part of 3 patches that together form a single patch, but must be introduced in stages in order not to break things.
The way that LLVM interprets DW_OP_plus in DIExpression nodes is basically that of the DW_OP_plus_uconst operator since LLVM expects an unsigned constant operand. This unnecessarily restricts the DW_OP_plus operator, preventing it from being used to describe the evaluation of runtime values on the expression stack. These patches try to align the semantics of DW_OP_plus and DW_OP_minus with that of the DWARF definition, which pops two elements off the expression stack, performs the operation and pushes the result back on the stack.
This is done in three stages:
• The first patch (LLVM) adds support for DW_OP_plus_uconst.
• The second patch (Clang) contains changes all its uses from DW_OP_plus to DW_OP_plus_uconst.
• The third patch (LLVM) changes the semantics of DW_OP_plus and DW_OP_minus to be in line with its DWARF meaning. This patch includes the bitcode upgrade from legacy DIExpressions.
Patch by Sander de Smalen.
Reviewers: echristo, pcc, aprantl
Reviewed By: aprantl
Subscribers: fhahn, javed.absar, aprantl, llvm-commits
Differential Revision: https://reviews.llvm.org/D33894
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305386 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When legalizing G_LOAD/G_STORE using NarrowScalar, we should avoid emitting
%0 = G_CONSTANT ty 0
%1 = G_GEP %x, %0
since it's cheaper to not emit the redundant instructions than it is to fold them
away later.
Reviewers: qcolombet, t.p.northover, ab, rovka, aditya_nandakumar, kristof.beyls
Reviewed By: qcolombet
Subscribers: javed.absar, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D32746
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305340 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch is part of 3 patches that together form a single patch, but must be introduced in stages in order not to break things.
The way that LLVM interprets DW_OP_plus in DIExpression nodes is basically that of the DW_OP_plus_uconst operator since LLVM expects an unsigned constant operand. This unnecessarily restricts the DW_OP_plus operator, preventing it from being used to describe the evaluation of runtime values on the expression stack. These patches try to align the semantics of DW_OP_plus and DW_OP_minus with that of the DWARF definition, which pops two elements off the expression stack, performs the operation and pushes the result back on the stack.
This is done in three stages:
• The first patch (LLVM) adds support for DW_OP_plus_uconst.
• The second patch (Clang) contains changes all its uses from DW_OP_plus to DW_OP_plus_uconst.
• The third patch (LLVM) changes the semantics of DW_OP_plus and DW_OP_minus to be in line with its DWARF meaning. This patch includes the bitcode upgrade from legacy DIExpressions.
Patch by Sander de Smalen.
Reviewers: pcc, echristo, aprantl
Reviewed By: aprantl
Subscribers: fhahn, aprantl, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D33892
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305304 91177308-0d34-0410-b5e6-96231b3b80d8
Fix thinko/typo in subreg aware liverange splitting logic. I'm not sure
how to write a proper testcase for this. The original problem only
happens on an out-of-tree target. Forcing subreg enabled targets to
spill and split in a predictable way is near impossible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305228 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This change enables the sin(x) cos(x) -> sincos(x) optimization on GNU
target triples. This optimization was being inhibited when -ffast-math
wasn't set because sincos in GLibC does not set errno, while sin and cos
do. However, this optimization will only run if the attributes on the
sin/cos calls include readnone, which is how clang represents the fact
that it doesn't care about the errno values set by these functions (via
the -fno-math-errno flag).
Reviewers: hfinkel, bogner
Subscribers: mcrosier, javed.absar, llvm-commits, paul.redmond
Differential Revision: https://reviews.llvm.org/D32921
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305204 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The old check for slot overlap treated 2 slots `S` and `T` as
overlapping if there existed a CFG node in which both of the slots could
possibly be active. That is overly conservative and caused stack blowups
in Rust programs. Instead, check whether there is a single CFG node in
which both of the slots are possibly active *together*.
Fixes PR32488.
Patch by Ariel Ben-Yehuda <ariel.byd@gmail.com>
Reviewers: thanm, nagisa, llvm-commits, efriedma, rnk
Reviewed By: thanm
Subscribers: dotdash
Differential Revision: https://reviews.llvm.org/D31583
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305193 91177308-0d34-0410-b5e6-96231b3b80d8
This step is just intended to reduce code duplication rather than change any functionality.
A follow-up would be to replace PPCTargetLowering::spliceIntoChain() usage with this new helper.
Differential Revision: https://reviews.llvm.org/D33649
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305192 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: UADDO has 2 result, and one must check the result no before doing any kind of combine. Without it, the transform is invalid.
Reviewers: joerg
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34088
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305162 91177308-0d34-0410-b5e6-96231b3b80d8
We're currently passing endian-ness around as a param (and not uniformly),
so this eliminates the need for that. I'd like to add a constant fold
call too, and that requires a DL.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305129 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
During DAG legalization loop in SelectionDAG::Legalize(),
bookkeeping of the SDNodes that were already legalized is implemented
with SmallPtrSet (LegalizedNodes). This kind of set stores only pointers
to objects, not the objects themselves. Unfortunately, if SDNode is
deleted during legalization for some reason, LegalizedNodes set is not
informed about this fact. This wouldn’t be so bad, if SelectionDAG wouldn’t reuse
space deallocated after deletion of unused nodes, for creation of new
ones. Because of this, new nodes, created during legalization often can
have pointers identical to ones that have been previously legalized,
added to the LegalizedNodes set, and deleted afterwards. This in turn
causes, that newly created nodes, sharing the same pointer as deleted
old ones, are present in LegalizedNodes *already at the moment of
creation*, so we never call Legalize on them.
The fix facilitates the fact, that DAG notifies listeners about each
modification. I have registered DAGNodeDeletedListener inside
SelectionDAG::Legalize, with a callback function that removes any
pointer of any deleted SDNode from the LegalizedNodes set. With this
modification, LegalizeNodes set does not contain pointers to nodes that
were deleted, so newly created nodes can always be inserted to it, even
if they share pointers with old deleted nodes.
Patch by pawel.szczerbuk@intel.com
The issue this patch addresses causes failures in an out-of-tree target,
and i was not able to create a reproducer for an in-tree target, hence
there is no test-case.
Reviewers: delena, spatel, RKSimon, hfinkel, davide, qcolombet
Reviewed By: delena
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33891
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305084 91177308-0d34-0410-b5e6-96231b3b80d8
By target hookifying getRegisterType, getNumRegisters, getVectorBreakdown,
backends can request that LLVM to scalarize vector types for calls
and returns.
The MIPS vector ABI requires that vector arguments and returns are passed in
integer registers. With SelectionDAG's new hooks, the MIPS backend can now
handle LLVM-IR with vector types in calls and returns. E.g.
'call @foo(<4 x i32> %4)'.
Previously these cases would be scalarized for the MIPS O32/N32/N64 ABI for
calls and returns if vector types were not legal. If vector types were legal,
a single 128bit vector argument would be assigned to a single 32 bit / 64 bit
integer register.
By teaching the MIPS backend to inspect the original types, it can now
implement the MIPS vector ABI which requires a particular method of
scalarizing vectors.
Previously, the MIPS backend relied on clang to scalarize types such as "call
@foo(<4 x float> %a) into "call @foo(i32 inreg %1, i32 inreg %2, i32 inreg %3,
i32 inreg %4)".
This patch enables the MIPS backend to take either form for vector types.
The previous version of this patch had a "conditional move or jump depends on
uninitialized value".
Reviewers: zoran.jovanovic, jaydeep, vkalintiris, slthakur
Differential Revision: https://reviews.llvm.org/D27845
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305083 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Currently XRay compares its threshold against `Function::size()` . However, `Function::size()` returns the number of basic blocks (as I understand, such as cycle bodies, if/else bodies, switch-case bodies, etc.), rather than the number of instructions.
The name of the parameter `-fxray-instruction-threshold=N`, as well as XRay documentation at http://llvm.org/docs/XRay.html , suggests that instructions should be counted, rather than the number of basic blocks.
I see two options:
1. Count the number of MachineInstr`s in MachineFunction : this gives better estimate for the number of assembly instructions on the target. So a user can check in disassembly that the threshold works more or less correctly.
2. Count the number of Instruction`s in a Function : AFAIK, this gives correct number of IR instructions, which the user can check in IR listing. However, this number may be far (several times for small functions) from the number of assembly instructions finally emitted.
Option 1 is implemented in this patch because I think that having the closer estimate for the number of assembly instructions emitted is more important than to have a clear definition of the metric.
Reviewers: dberris, rengolin
Reviewed By: dberris
Subscribers: llvm-commits, iid_iunknown
Differential Revision: https://reviews.llvm.org/D34027
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305072 91177308-0d34-0410-b5e6-96231b3b80d8
This prevents against assertion errors like PR32659 which occur from a
replacement deleting a node after it's been added to the list argument
of RemoveDeadNodes. The specific failure from PR32659 does not
currently happen, but it is still potentially possible. The underlying
cause is that the callers of the change dfunction builds up a list of
nodes to delete after having moved their uses and it possible that a
move of a later node will cause a previously deleted nodes to be
deleted.
Reviewers: bkramer, spatel, davide
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33731
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305070 91177308-0d34-0410-b5e6-96231b3b80d8
This is a preparatory change to expose the debug compression style to
clang. It requires exposing the enumeration and passing the actual
value through to the backend from the frontend in actual value form
rather than a boolean that selects the GNU style of debug info
compression.
Minor tweak to the ELF Object Writer to use a variable for re-used
values. Add an assertion that debug information format is one of the
two currently known types if debug information is being compressed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305038 91177308-0d34-0410-b5e6-96231b3b80d8
(0) RegAllocPBQP: Since getRawAllocationOrder() may return a collection that includes reserved physical registers, iterate to find an un-reserved physical register.
(1) VirtRegMap: Enforce the invariant: "no reserved physical registers" in assignVirt2Phys(). Previously, this was checked only after the fact in VirtRegRewriter::rewrite.
(2) MachineVerifier: updated the test per MatzeB's review.
(3) +testcase
Patch by Nick Johnson<Nicholas.Paul.Johnson@deshawresearch.com>!
Differential Revision: https://reviews.llvm.org/D33947
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305016 91177308-0d34-0410-b5e6-96231b3b80d8
The test diff for PowerPC shows we can better optimize if this case is one block.
For x86, there's would be a substantial difference if CGP expansion was enabled because branches are assumed
cheap and SDAG can't optimize across blocks.
Instead of this:
_cmp_eq8:
movq (%rdi), %rax
cmpq (%rsi), %rax
je LBB23_1
## BB#2: ## %res_block
movl $1, %ecx
jmp LBB23_3
LBB23_1:
xorl %ecx, %ecx
LBB23_3: ## %endblock
xorl %eax, %eax
testl %ecx, %ecx
sete %al
retq
We get this:
cmp_eq8:
movq (%rdi), %rcx
xorl %eax, %eax
cmpq (%rsi), %rcx
sete %al
retq
And that matches the optimal codegen that we get from the current expansion in SelectionDAGBuilder::visitMemCmpCall().
If this looks right, then I just need to confirm that vector-sized expansion will work from here, and we can enable
CGP memcmp() expansion for x86. Ie, we'll bypass the power-of-2 special cases currently optimized in SDAG because we
can lower the IR produced here optimally.
Differential Revision: https://reviews.llvm.org/D34005
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304987 91177308-0d34-0410-b5e6-96231b3b80d8
When considering merging stores values are the results of loads only
consider stores whose values come from loads from the same base.
This fixes much of the longer compile times in PR33330.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304934 91177308-0d34-0410-b5e6-96231b3b80d8
This could be viewed as another shortcoming of the DAGCombiner:
when both operands of a compare are zexted from the same source
type, we should be able to compare the original types.
The effect on PowerPC perf is likely unnoticeable, but there's a
visible regression for x86 if we feed the suboptimal IR for memcmp
expansion to the DAG:
_cmp_eq4_zexted_to_i64:
movl (%rdi), %ecx
movl (%rsi), %edx
xorl %eax, %eax
cmpq %rdx, %rcx
sete %al
_cmp_eq4_better:
movl (%rdi), %ecx
xorl %eax, %eax
cmpl (%rsi), %ecx
sete %al
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304923 91177308-0d34-0410-b5e6-96231b3b80d8
In the special (but also the likely common) case, we can avoid
the multi-block complexity of the general algorithm, so moving
this part off on its own will make it re-usable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304908 91177308-0d34-0410-b5e6-96231b3b80d8
This creates a new library called BinaryFormat that has all of
the headers from llvm/Support containing structure and layout
definitions for various types of binary formats like dwarf, coff,
elf, etc as well as the code for identifying a file from its
magic.
Differential Revision: https://reviews.llvm.org/D33843
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304864 91177308-0d34-0410-b5e6-96231b3b80d8
I'd like to enable CGP memcmp expansion for x86, but the output from CGP would regress the
special cases (memcmp(x,y,N) != 0 for N=1,2,4,8,16,32 bytes) that we already handle.
I'm not sure if we'll actually be able to produce the optimal code given the block-at-a-time
limitation in the DAG. We might have to just avoid those special-cases here in CGP. But
regardless of that, I think this is a win for the more general cases.
http://rise4fun.com/Alive/cbQ
Differential Revision: https://reviews.llvm.org/D33963
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304849 91177308-0d34-0410-b5e6-96231b3b80d8
- Add -x <language> option to switch between IR and MIR inputs.
- Change MIR parser to read from stdin when filename is '-'.
- Add a simple mir roundtrip test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304825 91177308-0d34-0410-b5e6-96231b3b80d8
CodeGen uses MO_ExternalSymbol to represent the inline assembly strings.
Empty strings for symbol names appear to be invalid. For now just
special case the output code to avoid hitting an `assert()` in
`printLLVMNameWithoutPrefix()`.
This fixes https://llvm.org/PR33317
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304815 91177308-0d34-0410-b5e6-96231b3b80d8
I did this a long time ago with a janky python script, but now
clang-format has built-in support for this. I fed clang-format every
line with a #include and let it re-sort things according to the precise
LLVM rules for include ordering baked into clang-format these days.
I've reverted a number of files where the results of sorting includes
isn't healthy. Either places where we have legacy code relying on
particular include ordering (where possible, I'll fix these separately)
or where we have particular formatting around #include lines that
I didn't want to disturb in this patch.
This patch is *entirely* mechanical. If you get merge conflicts or
anything, just ignore the changes in this patch and run clang-format
over your #include lines in the files.
Sorry for any noise here, but it is important to keep these things
stable. I was seeing an increasing number of patches with irrelevant
re-ordering of #include lines because clang-format was used. This patch
at least isolates that churn, makes it easy to skip when resolving
conflicts, and gets us to a clean baseline (again).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304787 91177308-0d34-0410-b5e6-96231b3b80d8
If -simplify-mir option is passed then MIRPrinter will not print such fields.
This change also required some lit test cases in CodeGen directory to be changed.
Reviewed By: MatzeB
Differential Revision: https://reviews.llvm.org/D32304
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304779 91177308-0d34-0410-b5e6-96231b3b80d8
When parsing .mir files immediately construct the MachineFunctions and
put them into MachineModuleInfo.
This allows us to get rid of the delayed construction (and delayed error
reporting) through the MachineFunctionInitialzier interface.
Differential Revision: https://reviews.llvm.org/D33809
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304758 91177308-0d34-0410-b5e6-96231b3b80d8
- Move ISel (and pre-isel) pass construction into TargetPassConfig
- Extract AsmPrinter construction into a helper function
Putting the ISel code into TargetPassConfig seems a lot more natural and
both changes together make make it easier to build custom pipelines
involving .mir in an upcoming commit. This moves MachineModuleInfo to an
earlier place in the pass pipeline which shouldn't have any effect.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304754 91177308-0d34-0410-b5e6-96231b3b80d8
If a tied source operand was undef, it would be replaced but not
update the other tied operand, which would end up using different
virtual registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304747 91177308-0d34-0410-b5e6-96231b3b80d8
Running `llc -verify-dom-info` on the attached testcase results in a
crash in the verifier, due to a stale dominator tree.
i.e.
DominatorTree is not up to date!
Computed:
=============================--------------------------------
Inorder Dominator Tree:
[1] %safe_mod_func_uint8_t_u_u.exit.i.i.i {0,7}
[2] %lor.lhs.false.i61.i.i.i {1,2}
[2] %safe_mod_func_int8_t_s_s.exit.i.i.i {3,6}
[3] %safe_div_func_int64_t_s_s.exit66.i.i.i {4,5}
Actual:
=============================--------------------------------
Inorder Dominator Tree:
[1] %safe_mod_func_uint8_t_u_u.exit.i.i.i {0,9}
[2] %lor.lhs.false.i61.i.i.i {1,2}
[2] %safe_mod_func_int8_t_s_s.exit.i.i.i {3,8}
[3] %safe_div_func_int64_t_s_s.exit66.i.i.i {4,5}
[3] %safe_mod_func_int8_t_s_s.exit.i.i.i.lor.lhs.false.i61.i.i.i_crit_edge {6,7}
This is because in `SelectionDAGIsel` we split critical edges without
updating the corresponding dominator for the function (and we claim
in `MachineFunctionPass::getAnalysisUsage()` that the domtree is preserved).
We could either stop preserving the domtree in `getAnalysisUsage`
or tell `splitCriticalEdge()` to update it.
As the second option is easy to implement, that's the one I chose.
Differential Revision: https://reviews.llvm.org/D33800
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304742 91177308-0d34-0410-b5e6-96231b3b80d8
This ensures that we can emit the ObjC Image Info structure on COFF and
ELF as well. The frontend already would attempt to emit this
information but would get dropped when generating assembly or an object
file.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304736 91177308-0d34-0410-b5e6-96231b3b80d8
Other calls to DAGCombiner::*PromoteOperand check the result, but here it could cause an assertion in getNode.
Falling back to any extend in this case instead of failing outright seems correct to me.
No test case because:
The failure was triggered by an out of tree backend. In order to trigger it, a backend would need to overload
TargetLowering::IsDesirableToPromoteOp to return true for a type for which ISD::SIGN_EXTEND_INREG is marked
illegal. In tree, only X86 overloads and sometimes returns true for MVT::i16 yet it marks
setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::i16 , Legal);.
Patch by Jacob Young!
Differential Revision: https://reviews.llvm.org/D33633
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304723 91177308-0d34-0410-b5e6-96231b3b80d8
This patch provides a means to specify section-names for global variables,
functions and static variables, using #pragma directives.
This feature is only defined to work sensibly for ELF targets.
One can specify section names as:
#pragma clang section bss="myBSS" data="myData" rodata="myRodata" text="myText"
One can "unspecify" a section name with empty string e.g.
#pragma clang section bss="" data="" text="" rodata=""
Reviewers: Roger Ferrer, Jonathan Roelofs, Reid Kleckner
Differential Revision: https://reviews.llvm.org/D33413
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304704 91177308-0d34-0410-b5e6-96231b3b80d8
Adjust code to look more like the code in LivePhysRegs and port over the
fix for LivePhysRegs from r304001 and adapt to the new CSR management in
MachineRegisterInfo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304622 91177308-0d34-0410-b5e6-96231b3b80d8
We'd called this "vm state" in the early days, but have long since standardized on calling it "deopt" in line with the operand bundle tag. Fix a few cases we'd missed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304607 91177308-0d34-0410-b5e6-96231b3b80d8
This pass allows to run the register scavenging independently of
PrologEpilogInserter to allow targeted testing.
Also adds some basic register scavenging tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304606 91177308-0d34-0410-b5e6-96231b3b80d8
Prior to this patch we used to not touch the LiveRegMatrix while doing
live-range splitting. In other words, when live-range splitting was
occurring, the LiveRegMatrix was not reflecting the changes.
This is generally fine because it means the query to the LiveRegMatrix
will be conservately correct. However, when decisions are taken based on
what is going to happen on the interferences (e.g., when we spill a
register and know that it is going to be available for another one), we
might hit an assertion that the color used for the assignment is still
in use.
This patch makes sure the changes on the live-ranges are properly
reflected in the LiveRegMatrix, so the assertions don't break.
An alternative could have been to remove the assertion, but it would
make the invariants of the code and the general reasoning more
complicated in my opnion.
http://llvm.org/PR33057
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304603 91177308-0d34-0410-b5e6-96231b3b80d8
Use the initializeXXX method to initialize the RABasic pass in the
pipeline. This enables us to take advantage of the .mir infrastructure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304602 91177308-0d34-0410-b5e6-96231b3b80d8
These parts do not depend on any PrologEpilogInserter logic and
therefore better fits RegisterScaveging.cpp.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304596 91177308-0d34-0410-b5e6-96231b3b80d8
While doing so, clarify the comments and update them to reflect current reality.
Note: I'm going to let this sit for a week or so before adding further verification. I want to give this time to cycle through bots and merge it into our downstream tree before pushing this further.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304565 91177308-0d34-0410-b5e6-96231b3b80d8
This initial patch doesn't actually do much useful. It's just to show where the new code goes. Once this is in, I'll extend the verification logic to check more useful properties.
For those curious, the more complicated version of this patch already found one very suspicious thing.
Differential Revision: https://reviews.llvm.org/D33819
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304564 91177308-0d34-0410-b5e6-96231b3b80d8
The recursive implementation of findNonImmUse may overflow stack
on extremely long use chains. This patch replaces it with an equivalent
iterative implementation.
Reviewed By: bogner
Differential Revision: https://reviews.llvm.org/D33775
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304522 91177308-0d34-0410-b5e6-96231b3b80d8
The AArch64 backend marks calls that involve aggregate function
arguments as having an implicit def of SP. We already have the same
workaround in LiveDebugValues and in DbgValueHistoryCalculator for SP
clobbers in register masks. This adds register defs to the list.
Fixes rdar://problem/30361929 and Swift SR-3851.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304471 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is a problem uncovered by stage2 testing. ADDCARRY end up being generated on target that do not support it.
The patch that introduced the problem has other patches layed on top of it, so we want to fix the issue rather than revert it to avoid creating a lor of churn.
A regression test will be added shortly, but this is committed as this in order to get the build back to green promptly.
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33770
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304409 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is a continuation of the work started in D29872 . Passing the carry down as a value rather than as a glue allows for further optimizations. Introducing setcccarry makes the use of addc/subc unecessary and we can start the removal process.
This patch only introduce the optimization strictly required to get the same level of optimization as was available before nothing more.
Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D33374
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304404 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: This pattern is no very useful per se, but it exposes optimization for toehr patterns that wouldn't kick in otherwize. It's very common and worth optimizing for.
Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32756
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304402 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: LiveRangeShrink pass moves instruction right after the definition with the same BB if the instruction and its operands all have more than one use. This pass is inexpensive and guarantees optimal live-range within BB.
Reviewers: davidxl, wmi, hfinkel, MatzeB, andreadb
Reviewed By: MatzeB, andreadb
Subscribers: hiraditya, jyknight, sanjoy, skatkov, gberry, jholewinski, qcolombet, javed.absar, krytarowski, atrick, spatel, RKSimon, andreadb, MatzeB, mehdi_amini, mgorny, efriedma, davide, dberlin, llvm-commits
Differential Revision: https://reviews.llvm.org/D32563
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304371 91177308-0d34-0410-b5e6-96231b3b80d8
We should have a single call site entry with no landing pad. This
indicates that no EH action should be taken and the unwinder should
unwind to the next frame.
We currently don't recognize __gxx_personality_seh0 as a known
personality, so we forcibly emit a table, and that table was wrong. This
was filed as PR33220. Now we emit a correct table for that personality.
The next step is to recognize that we can completely skip the table for
this personality.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304363 91177308-0d34-0410-b5e6-96231b3b80d8
It seems not all of our bots have a std::vector::erase() taking a
const_iterator (even though that seems to be part of C++11) attempt to
workaround.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304349 91177308-0d34-0410-b5e6-96231b3b80d8
After transforming FP to ST registers:
- Do not add the ST register to the livein lists, they are reserved so
we do not need to track their liveness.
- Remove the FP registers from the livein lists, they don't have defs or
uses anymore and so are not live.
- (The setKillFlags() call is moved to an earlier place as it relies on
the FP registers still being present in the livein list.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304342 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
If we attempt to unfold an SUnit in ScheduleDAG that results in
finding an already scheduled load, we must should abort the
unfold as it will not improve scheduling.
This fixes PR32610.
Reviewers: jmolloy, sunfish, bogner, spatel
Subscribers: llvm-commits, MatzeB
Differential Revision: https://reviews.llvm.org/D32911
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304321 91177308-0d34-0410-b5e6-96231b3b80d8
This adds a callback to the LLVMTargetMachine that lets target indicate
that they do not pass the machine verifier checks in all cases yet.
This is intended to be a temporary measure while the targets are fixed
allowing us to enable the machine verifier by default with
EXPENSIVE_CHECKS enabled!
Differential Revision: https://reviews.llvm.org/D33696
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304320 91177308-0d34-0410-b5e6-96231b3b80d8
This patch does an inline expansion of memcmp.
It changes the memcmp library call into an inline expansion when the size is
known at compile time and is under a target specified threshold.
This expansion is implemented in CodeGenPrepare and expands into straight line
code. The target specifies a maximum load size and the expansion works by using
this size to load the two sources, compare, and exit early if a difference is
found. It also has a special case when the memcmp result is used in a compare
to zero equality.
Differential Revision: https://reviews.llvm.org/D28637
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304313 91177308-0d34-0410-b5e6-96231b3b80d8
Correct references to alignment of store which may be deleted in a
previous iteration of merge. Instead use first store that would be
merged.
Corrects pr33172's use-after-poison caught by ASan.
Reviewers: spatel, hfinkel, RKSimon
Reviewed By: RKSimon
Subscribers: thegameg, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D33686
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304299 91177308-0d34-0410-b5e6-96231b3b80d8
This was introduced a long time ago in r86583 when regmask operands
didn't exist. Nowadays the behavior hurts more than it helps. This
removes it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304254 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
AntiDepBreaker intends to add all live-outs, including the implicit
CSRs, in StartBlock. r299124 was done without understanding that
intention.
Now with the live-ins propagated correctly (D32464), we can revert this change.
Reviewers: MatzeB, qcolombet
Subscribers: nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D33697
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304251 91177308-0d34-0410-b5e6-96231b3b80d8
TargetPassConfig is not useful for targets that do not use the CodeGen
library, so we may just as well store a pointer to an
LLVMTargetMachine instead of just to a TargetMachine.
While at it, also change the constructor to take a reference instead of a
pointer as the TM must not be nullptr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304247 91177308-0d34-0410-b5e6-96231b3b80d8
There is no guarantee that the first use of a constant that is traversed
is actually the first in the related basic block. Thus, if we use that
as the insertion point we may end up with definitions that don't
dominate there use.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304244 91177308-0d34-0410-b5e6-96231b3b80d8
This code was compensating for FPOWI defaulting to Legal and many targets not changing it to Expand. This was fixed in r304215 to default to Expand so this special handling should no longer be necessary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304221 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Currently FPOWI defaults to Legal and LegalizeDAG.cpp turns Legal into Expand for this opcode because Legal is a "lie".
This patch changes the default for this opcode to Expand and removes the hack from LegalizeDAG.cpp. It also removes all the code in the targets that set this opcode to Expand themselves since they can just rely on the default.
Reviewers: spatel, RKSimon, efriedma
Reviewed By: RKSimon
Subscribers: jfb, dschuff, sbc100, jgravelle-google, nemanjai, javed.absar, andrew.w.kaylor, llvm-commits
Differential Revision: https://reviews.llvm.org/D33530
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304215 91177308-0d34-0410-b5e6-96231b3b80d8
This is really a workaround for ThinLTO in particular - since it can
import partial CUs that may end up looking very similar/the same as
the same partial import in another ThinLTO compile.
An alternative fix would be to change the DICompileUnit metadata to
include a "primary file" or the like - and when importing for ThinLTO
set the primary file to the name of the DICompileUnit that is being
imported into. This involves changing the schema and would reduce the
excessive uniqueness in the hash that this change creates - allowing
diagnosing of more duplicate CUs than will be caught with this change.
But duplicate CUs can still be caught in non-ThinLTO builds & are mostly
a nuisance rather than a particularly deliberate/effective tool for
finding broken code. (arguably the hash could always include the dwo
file and nothing in fission would break, I think..)
Reapply of r304119 after adding a triple to the test and moving it
to the X86 directory.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304130 91177308-0d34-0410-b5e6-96231b3b80d8
When the only use of a CU is for a subprogram that's only emitted into
the using CU (to avoid cross-CU references in DWO files), avoid creating
that CU at all.
Reapply of r304111 after adding a triple to the test and moving it
to the X86 directory.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304129 91177308-0d34-0410-b5e6-96231b3b80d8
The reverted change introdued assertions ala:
"MachineBasicBlock::succ_iterator
llvm::MachineBasicBlock::removeSuccessor(succ_iterator, bool): Assertion
`I != Successors.end() && "Not a current successor!"'
Mikael, the original committer, wrote me that he is working on a fix, but that
it likely will take some time to get this resolved. As this bug is one of the
last two issues that keep the AOSP buildbot from turning green, I revert the
original commit r302876.
I am looking forward to see this recommitted after the assertion has been
resolved.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304128 91177308-0d34-0410-b5e6-96231b3b80d8
This is really a workaround for ThinLTO in particular - since it can
import partial CUs that may end up looking very similar/the same as
the same partial import in another ThinLTO compile.
An alternative fix would be to change the DICompileUnit metadata to
include a "primary file" or the like - and when importing for ThinLTO
set the primary file to the name of the DICompileUnit that is being
imported into. This involves changing the schema and would reduce the
excessive uniqueness in the hash that this change creates - allowing
diagnosing of more duplicate CUs than will be caught with this change.
But duplicate CUs can still be caught in non-ThinLTO builds & are mostly
a nuisance rather than a particularly deliberate/effective tool for
finding broken code. (arguably the hash could always include the dwo
file and nothing in fission would break, I think..)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304119 91177308-0d34-0410-b5e6-96231b3b80d8
When the only use of a CU is for a subprogram that's only emitted into
the using CU (to avoid cross-CU references in DWO files), avoid creating
that CU at all.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304111 91177308-0d34-0410-b5e6-96231b3b80d8
If we have (extract_subvector(load wide vector)) with no other users,
that can just be (load narrow vector). This is intentionally conservative.
Follow-ups may loosen the one-use constraint to account for the extract cost
or just remove the one-use check.
The memop chain updating is based on code that already exists multiple times
in x86 lowering, so that should be pulled into a helper function as a follow-up.
Background: this is a potential improvement noticed via regressions caused by
making x86's peekThroughBitcasts() not loop on consecutive bitcasts (see
comments in D33137).
Differential Revision: https://reviews.llvm.org/D33578
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304072 91177308-0d34-0410-b5e6-96231b3b80d8
Rewrite fixupKills() to use the LivePhysRegs class. Simplifies the code
and fixes a bug where the CSR registers in return blocks where missed
leading to invalid kill flags. Also remove the unnecessary rule that we
wouldn't set kill flags on tied operands.
No tests as I have an upcoming commit improving MachineVerifier checks
to catch these cases in multiple existing lit tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304055 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r299287 plus clean-ups.
The localizer pass is a helper pass that could be run at O0 in the GISel
pipeline to work around the deficiency of the fast register allocator.
It basically shortens the live-ranges of the constants so that the
allocator does not spill all over the place.
Long term fix would be to make the greedy allocator fast.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304051 91177308-0d34-0410-b5e6-96231b3b80d8
One case in BranchRelaxation did not compute liveins after creating a
new block. This is catched by existing tests with an upcoming commit
that will improve MachineVerifier checking of livein lists.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304049 91177308-0d34-0410-b5e6-96231b3b80d8
Consistent with GCC and addresses a shortcoming with ThinLTO where many
imported CUs may end up being empty (because the functions imported from
them either ended up not being used (and were then discarded, since
they're imported as available_externally) or optimized away entirely).
Test cases previously testing empty CUs (either intentionally, or
because they didn't need anything more complicated) had a trivial 'int'
or similar basic type added to their retained types list.
This is a first order approximation - a deeper implementation could do
things like:
1) Be more lazy about construction of the CU - for example if two CUs
containing a single identical retained type are linked together, with
this change one of the two CUs will be produced but empty (since a
duplicate type won't be produced).
2) Go further and invert all the CU links the same way the subprogram
link is inverted - keep named CU lists of retained types, macros, etc,
and have those link back to the CU. Then if they're emitted, the CU is
emitted, but never otherwise - this would allow the metadata itself to
be dropped earlier too, though it seems unlikely that's an important
optimization as there shouldn't be many CUs relative to the number of
other entities.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304020 91177308-0d34-0410-b5e6-96231b3b80d8
This produced 'strange' DWARF anyway - the CU would have no ranges (or
at least not a range including the inlined code) nor any subprogram or
inlined_subroutine - yet the line table would have entries for these
instructions.
(this actually becomes more relevant with changes coming after this,
where a CU without any contents will be omitted entirely - so there
would be no line table to put this on anyway)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304004 91177308-0d34-0410-b5e6-96231b3b80d8
Re-commit r303938 and r303954 with a fix for addLiveIns(): the internal
addPristines() function must be called on an empty set or it may
accidentally reset saved registers.
- addLiveOutsNoPristines() needs to add callee saved registers that are
actually saved and restored somewhere to the set (they are not
pristine).
- Cleanup/rewrite the code for addLiveOuts()/addLiveOutsNoPristines().
This fixes the problem from D32156.
Differential Revision: https://reviews.llvm.org/D32464
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304001 91177308-0d34-0410-b5e6-96231b3b80d8
In the best case:
extract (binop (concat X1, X2), (concat Y1, Y2)), N --> binop XN, YN
...we kill all of the extract/concat and just have narrow binops remaining.
If only one of the binop operands is amenable, this transform is still
worthwhile because we kill some of the extract/concat.
Optional bitcasting makes the code more complicated, but there doesn't
seem to be a way to avoid that.
The TODO about extending to more than bitwise logic is there because we really
will regress several x86 tests including madd, psad, and even a plain
integer-multiply-by-2 or shift-left-by-1. I don't think there's anything
fundamentally wrong with this patch that would cause those regressions; those
folds are just missing or brittle.
If we extend to more binops, I found that this patch will fire on at least one
non-x86 regression test. There's an ARM NEON test in
test/CodeGen/ARM/coalesce-subregs.ll with a pattern like:
t5: v2f32 = vector_shuffle<0,3> t2, t4
t6: v1i64 = bitcast t5
t8: v1i64 = BUILD_VECTOR Constant:i64<0>
t9: v2i64 = concat_vectors t6, t8
t10: v4f32 = bitcast t9
t12: v4f32 = fmul t11, t10
t13: v2i64 = bitcast t12
t16: v1i64 = extract_subvector t13, Constant:i32<0>
There was no functional change in the codegen from this transform from what I
could see though.
For the x86 test changes:
1. PR32790() is the closest call. We don't reduce the AVX1 instruction count in that case,
but we improve throughput. Also, on a core like Jaguar that double-pumps 256-bit ops,
there's an unseen win because two 128-bit ops have the same cost as the wider 256-bit op.
SSE/AVX2/AXV512 are not affected which is expected because only AVX1 has the extract/concat
ops to match the pattern.
2. do_not_use_256bit_op() is the best case. Everyone wins by avoiding the concat/extract.
Related bug for IR filed as: https://bugs.llvm.org/show_bug.cgi?id=33026
3. The SSE diffs in vector-trunc-math.ll are just scheduling/RA, so nothing real AFAICT.
4. The AVX1 diffs in vector-tzcnt-256.ll are all the same pattern: we reduced the instruction
count by one in each case by eliminating two insert/extract while adding one narrower logic op.
https://bugs.llvm.org/show_bug.cgi?id=32790
Differential Revision: https://reviews.llvm.org/D33137
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303997 91177308-0d34-0410-b5e6-96231b3b80d8
Currently getOptimalMemOpType returns i32 for large enough sizes without
checking for alignment, leading to poor code generation when misaligned accesses
aren't permitted as we generate a word store then later split it up into byte
stores. This means we inadvertantly go over the MaxStoresPerMemcpy limit and for
memset we splat the memset value into a word then immediately split it up
again.
Fix this by leaving it up to FindOptimalMemOpLowering to figure out which type
to use, but also fix a bug there where it wasn't correctly checking if
misaligned memory accesses are allowed.
Differential Revision: https://reviews.llvm.org/D33442
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303990 91177308-0d34-0410-b5e6-96231b3b80d8
Re-commit r303937 + r303949 as they were not the cause for the build
failures.
We do not track liveness of reserved registers so adding them to the
liveins list in computeLiveIns() was completely unnecessary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303970 91177308-0d34-0410-b5e6-96231b3b80d8
Tentatively revert this to see if it fixes the buildbot stage2
breakages.
This reverts commit r303938.
This reverts commit r303954.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303960 91177308-0d34-0410-b5e6-96231b3b80d8
Tentatively revert, suspecting that it caused breakage in stage2
buildbots.
This reverts commit r303949.
This reverts commit r303937.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303955 91177308-0d34-0410-b5e6-96231b3b80d8
We may have situations in which a superregister is reserved and not
added to liveins, so we have to add the subregisters.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303949 91177308-0d34-0410-b5e6-96231b3b80d8
- addLiveOutsNoPristines() needs to add callee saved registers that are
actually saved and restored somewhere to the set (they are not
pristine).
- Cleanup/rewrite the code for addLiveOuts()/addLiveOutsNoPristines().
This fixes the problem from D32156.
Differential Revision: https://reviews.llvm.org/D32464
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303938 91177308-0d34-0410-b5e6-96231b3b80d8
We do not track liveness of reserved registers so adding them to the
liveins list in computeLiveIns() was completely unnecessary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303937 91177308-0d34-0410-b5e6-96231b3b80d8
Previously this code was defensive to the situation in which the debug
info scopes would lead to a different subprogram from the subprogram in
the CU's subprogram list (this could've happened with linkonce
functions, etc as per the comment being removed). Since the CU<>SP link
reversal this is no longer possible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303933 91177308-0d34-0410-b5e6-96231b3b80d8
Rename the DEBUG_TYPE to match the names of corresponding passes where
it makes sense. Also establish the pattern of simply referencing
DEBUG_TYPE instead of repeating the passname where possible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303921 91177308-0d34-0410-b5e6-96231b3b80d8
Previously, every time we wanted to serialize a field list record, we
would create a new copy of FieldListRecordBuilder, which would in turn
create a temporary instance of TypeSerializer, which itself had a
std::vector<> that was about 128K in size. So this 128K allocation was
happening every time. We can re-use the same instance over and over, we
just have to clear its internal hash table and seen records list between
each run. This saves us from the constant re-allocations.
This is worth an ~18.5% speed increase (3.75s -> 3.05s) in my tests.
Differential Revision: https://reviews.llvm.org/D33506
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303919 91177308-0d34-0410-b5e6-96231b3b80d8
Turns out gold doesn't use the DW_AT_GNU_pubnames to decide whether to
parse the rest of the DIEs when building gdb-index. This causes gold to
trip over LLVM's output when there are DW_FORM_ref_addr present.
Gold does use the presence of a debug_gnu_pub{names,types} entry for the
CU to skip parsing the debug_info portion, so make sure that's included
even when empty (technically, when empty there couldn't be any ref_addr
anyway - it only came up when gmlt didn't produce any (even non-empty)
pubnames - but given what that reveals about gold's implementation, this
seems like a good thing to do for consistency).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303894 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is a fix for PR32538. MachineCSE first looks at MO.isDead(), but
if it is not marked dead, MachineCSE still wants to do its own check
to see if it is trivially dead. This check for the trivial case
assumed that physical registers cannot be live out of a block.
Patch by Mattias Eriksson.
Reviewers: qcolombet, jbhateja
Reviewed By: qcolombet, jbhateja
Subscribers: jbhateja, llvm-commits
Differential Revision: https://reviews.llvm.org/D33408
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303731 91177308-0d34-0410-b5e6-96231b3b80d8
C++14 added user-defined literal support for complex numbers so that you can
write something like "complex<double> val = 2i". However, there is an existing
GNU extension supporting this syntax and interpreting the result as a _Complex
type.
This changes parsing so that such literals are interpreted in terms of C++14's
operators if an overload is present but otherwise falls back to the original
GNU extension.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303694 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch makes instruction fusion more aggressive by
* adding artificial edges between the successors of FirstSU and
SecondSU, similar to BaseMemOpClusterMutation::clusterNeighboringMemOps.
* updating PostGenericScheduler::tryCandidate to keep clusters together,
similar to GenericScheduler::tryCandidate.
This change increases the number of AES instruction pairs generated on
Cortex-A57 and Cortex-A72. This doesn't change code at all in
most benchmarks or general code, but we've seen improvement on kernels
using AESE/AESMC and AESD/AESIMC.
Reviewers: evandro, kristof.beyls, t.p.northover, silviu.baranga, atrick, rengolin, MatzeB
Reviewed By: evandro
Subscribers: aemerson, rengolin, MatzeB, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D33230
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303618 91177308-0d34-0410-b5e6-96231b3b80d8
MachineInstructions that don't generate any code (such as
IMPLICIT_DEFs) should not generate any debug info either.
Fixes PR33107.
https://bugs.llvm.org/show_bug.cgi?id=33107
This reapplies r303566 without any modifications. The stage2 build
failures persisted even after reverting this patch, and looking back
through history, it looks like these tests are flaky.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303575 91177308-0d34-0410-b5e6-96231b3b80d8
Refactor the strlen optimization code to work for both strlen and wcslen.
This especially helps with programs in the wild where people pass
L"string"s to const std::wstring& function parameters and the wstring
constructor gets inlined.
This also fixes a lingerind API problem/bug in getConstantStringInfo()
where zeroinitializers would always give you an empty string (without a
length) back regardless of the actual length of the initializer which
did not work well in the TrimAtNul==false causing the PR mentioned
below.
Note that the fixed getConstantStringInfo() needed fixes to SelectionDAG
memcpy lowering and may lead to some cases for out-of-bounds
zeroinitializer accesses not getting optimized anymore. So some code
with UB may produce out of bound memory reads now instead of just
producing zeros.
The refactoring "accidentally" fixes http://llvm.org/PR32124
Differential Revision: https://reviews.llvm.org/D32839
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303461 91177308-0d34-0410-b5e6-96231b3b80d8
This was originally reverted because it was a breaking a bunch
of bots and the breakage was not surfacing on Windows. After much
head-scratching this was ultimately traced back to a bug in the
lit test runner related to its pipe handling. Now that the bug
in lit is fixed, Windows correctly reports these test failures,
and as such I have finally (hopefully) fixed all of them in this
patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303446 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
While this makes some case better and some case worse - so it's unclear if it is a worthy combine just by itself - this is a useful canonicalisation.
As per discussion in D32756 .
Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32916
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303441 91177308-0d34-0410-b5e6-96231b3b80d8
This is a squash of ~5 reverts of, well, pretty much everything
I did today. Something is seriously broken with lit on Windows
right now, and as a result assertions that fire in tests are
triggering failures. I've been breaking non-Windows bots all
day which has seriously confused me because all my tests have
been passing, and after running lit with -a to view the output
even on successful runs, I find out that the tool is crashing
and yet lit is still reporting it as a success!
At this point I don't even know where to start, so rather than
leave the tree broken for who knows how long, I will get this
back to green, and then once lit is fixed on Windows, hopefully
hopefully fix the remaining set of problems for real.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303409 91177308-0d34-0410-b5e6-96231b3b80d8
pruneSubRegValues() needs to remove subregister ranges starting at
instructions that later get removed by eraseInstrs(). It missed to check
one case in which eraseInstrs() would remove an instruction.
Fixes http://llvm.org/PR32688
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303396 91177308-0d34-0410-b5e6-96231b3b80d8
Right now we have multiple notions of things that represent collections of
types. Most commonly used are TypeDatabase, which is supposed to keep
mappings from TypeIndex to type name when reading a type stream, which
happens when reading PDBs. And also TypeTableBuilder, which is used to
build up a collection of types dynamically which we will later serialize
(i.e. when writing PDBs).
But often you just want to do some operation on a collection of types, and
you may want to do the same operation on any kind of collection. For
example, you might want to merge two TypeTableBuilders or you might want
to merge two type streams that you loaded from various files.
This dichotomy between reading and writing is responsible for a lot of the
existing code duplication and overlapping responsibilities in the existing
CodeView library classes. For example, after building up a
TypeTableBuilder with a bunch of type records, if we want to dump it we
have to re-invent a bunch of extra glue because our dumper takes a
TypeDatabase or a CVTypeArray, which are both incompatible with
TypeTableBuilder.
This patch introduces an abstract base class called TypeCollection which
is shared between the various type collection like things. Wherever we
previously stored a TypeDatabase& in some common class, we now store a
TypeCollection&.
The advantage of this is that all the details of how the collection are
implemented, such as lazy deserialization of partial type streams, is
completely transparent and you can just treat any collection of types the
same regardless of where it came from.
Differential Revision: https://reviews.llvm.org/D33293
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303388 91177308-0d34-0410-b5e6-96231b3b80d8
This also reverts follow-ups r303292 and r303298.
It broke some Chromium tests under MSan, and apparently also internal
tests at Google.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303369 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Implements PR889
Removing the virtual table pointer from Value saves 1% of RSS when doing
LTO of llc on Linux. The impact on time was positive, but too noisy to
conclusively say that performance improved. Here is a link to the
spreadsheet with the original data:
https://docs.google.com/spreadsheets/d/1F4FHir0qYnV0MEp2sYYp_BuvnJgWlWPhWOwZ6LbW7W4/edit?usp=sharing
This change makes it invalid to directly delete a Value, User, or
Instruction pointer. Instead, such code can be rewritten to a null check
and a call Value::deleteValue(). Value objects tend to have their
lifetimes managed through iplist, so for the most part, this isn't a big
deal. However, there are some places where LLVM deletes values, and
those places had to be migrated to deleteValue. I have also created
llvm::unique_value, which has a custom deleter, so it can be used in
place of std::unique_ptr<Value>.
I had to add the "DerivedUser" Deleter escape hatch for MemorySSA, which
derives from User outside of lib/IR. Code in IR cannot include MemorySSA
headers or call the MemoryAccess object destructors without introducing
a circular dependency, so we need some level of indirection.
Unfortunately, no class derived from User may have any virtual methods,
because adding a virtual method would break User::getHungOffOperands(),
which assumes that it can find the use list immediately prior to the
User object. I've added a static_assert to the appropriate OperandTraits
templates to help people avoid this trap.
Reviewers: chandlerc, mehdi_amini, pete, dberlin, george.burgess.iv
Reviewed By: chandlerc
Subscribers: krytarowski, eraman, george.burgess.iv, mzolotukhin, Prazek, nlewycky, hans, inglorion, pcc, tejohnson, dberlin, llvm-commits
Differential Revision: https://reviews.llvm.org/D31261
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303362 91177308-0d34-0410-b5e6-96231b3b80d8
This provides a new way to access the TargetMachine through
TargetPassConfig, as a dependency.
The patterns replaced here are:
* Passes handling a null TargetMachine call
`getAnalysisIfAvailable<TargetPassConfig>`.
* Passes not handling a null TargetMachine
`addRequired<TargetPassConfig>` and call
`getAnalysis<TargetPassConfig>`.
* MachineFunctionPasses now use MF.getTarget().
* Remove all the TargetMachine constructors.
* Remove INITIALIZE_TM_PASS.
This fixes a crash when running `llc -start-before prologepilog`.
PEI needs StackProtector, which gets constructed without a TargetMachine
by the pass manager. The StackProtector pass doesn't handle the case
where there is no TargetMachine, so it segfaults.
Related to PR30324.
Differential Revision: https://reviews.llvm.org/D33222
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303360 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
As of this patch, 1018 out of 3938 rules are currently imported.
Depends on D32275
Reviewers: qcolombet, kristof.beyls, rovka, t.p.northover, ab, aditya_nandakumar
Reviewed By: qcolombet
Subscribers: dberris, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D32278
The previous commit failed on test-suite/Bitcode/simd_ops/AArch64_halide_runtime.bc
because isImmOperandEqual() assumed MO was a register operand and that's not
always true.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303341 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
There are several places in the codebase that try to calculate a maximum value in a Statistic object. We currently do this in one of two ways:
MaxNumFoo = std::max(MaxNumFoo, NumFoo);
or
MaxNumFoo = (MaxNumFoo > NumFoo) ? MaxNumFoo : NumFoo;
The first version reads from MaxNumFoo one time and uncontionally rwrites to it. The second version possibly reads it twice depending on the result of the first compare. But we have no way of knowing if the value was changed by another thread between the reads and the writes.
This patch adds a method to the Statistic object that can ensure that we only store if our value is the max and the previous max didn't change after we read it. If it changed we'll recheck if our value should still be the max or not and try again.
This spawned from an audit I'm trying to do of all places we uses the implicit conversion to unsigned on the Statistics objects. See my previous thread on llvm-dev https://groups.google.com/forum/#!topic/llvm-dev/yfvxiorKrDQ
Reviewers: dberlin, chandlerc, hfinkel, dblaikie
Reviewed By: chandlerc
Subscribers: llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D33301
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303318 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: Moving LiveRangeShrink to x86 as this pass is mostly useful for archtectures with great register pressure.
Reviewers: MatzeB, qcolombet
Reviewed By: qcolombet
Subscribers: jholewinski, jyknight, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D33294
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303292 91177308-0d34-0410-b5e6-96231b3b80d8
Make sure IRTranslator->MachineIRBuilder->DebugLoc doesn't
outlive the DILocation. Clear it at the end of
IRTranslator::runOnMachineFunction
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303277 91177308-0d34-0410-b5e6-96231b3b80d8
There is often a lot of boilerplate code required to visit a type
record or type stream. The #1 use case is that you have a sequence
of bytes that represent one or more records, and you want to
deserialize each one, switch on it, and call a callback with the
deserialized record that the user can examine. Currently this
requires at least 6 lines of code:
codeview::TypeVisitorCallbackPipeline Pipeline;
Pipeline.addCallbackToPipeline(Deserializer);
Pipeline.addCallbackToPipeline(MyCallbacks);
codeview::CVTypeVisitor Visitor(Pipeline);
consumeError(Visitor.visitTypeRecord(Record));
With this patch, it becomes one line of code:
consumeError(codeview::visitTypeRecord(Record, MyCallbacks));
This is done by having the deserialization happen internally inside
of the visitTypeRecord function. Since this is occasionally not
desirable, the function provides a 3rd parameter that can be used
to change this behavior.
Hopefully this can significantly reduce the barrier to entry
to using the visitation infrastructure.
Differential Revision: https://reviews.llvm.org/D33245
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303271 91177308-0d34-0410-b5e6-96231b3b80d8
Don't allow -optimize-regalloc=false with -regalloc given for anything other
than 'fast'. The other register allocators depend on the supporting passes
added by addOptimizedRegAlloc().
Reviewers: Quentin Colombet, Matthias Braun
https://reviews.llvm.org/D33181
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303238 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
In SelectionDAG, when a store is immediately chained to another store
to the same address, elide the first store as it has no observable
effects. This is causes small improvements dealing with intrinsics
lowered to stores.
Test notes:
* Many testcases overwrite store addresses multiple times and needed
minor changes, mainly making stores volatile to prevent the
optimization from optimizing the test away.
* Many X86 test cases optimized out instructions associated with
associated with va_start.
* Note that test_splat in CodeGen/AArch64/misched-stp.ll no longer has
dependencies to check and can probably be removed and potentially
replaced with another test.
Reviewers: rnk, john.brawn
Subscribers: aemerson, rengolin, qcolombet, jyknight, nemanjai, nhaehnle, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D33206
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303198 91177308-0d34-0410-b5e6-96231b3b80d8
ShrinkWrapping is a performance optimization that can safely be skipped,
so we can add `if (!skipFunction()) return;`
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303197 91177308-0d34-0410-b5e6-96231b3b80d8
This function gives the wrong answer on some non-ELF platforms in some
cases. The function that does the right thing lives in Mangler.h. To try to
discourage people from using this function, give it a different name.
Differential Revision: https://reviews.llvm.org/D33162
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303134 91177308-0d34-0410-b5e6-96231b3b80d8
Shrink-wrapping uses post-dominators to find a restore point that
post-dominates all the uses of CSR / stack.
The way dominator trees are modeled in LLVM today is that unreachable
blocks are not present in a generic dominator tree, so, an unreachable node is
dominated by anything: include/llvm/Support/GenericDomTree.h:467.
Since for post-dominators, a no-return block is considered
"unreachable", calling findNearestCommonDominator on an unreachable node
A and a non-unreachable node B, will return B, which can be false. If we
find such node, we bail out since there is no good restore point
available.
rdar://problem/30186931
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303130 91177308-0d34-0410-b5e6-96231b3b80d8
At O3 we are more willing to increase size if we believe it will improve
performance. The current threshold for tail-duplication of 2 instructions is
conservative, and can be relaxed at O3.
Benchmark results:
llvm test-suite:
6% improvement in aha, due to duplication of loop latch
3% improvement in hexxagon
2% slowdown in lpbench. Seems related, but couldn't completely diagnose.
Internal google benchmark:
Produces 4% improvement on internal google protocol buffer serialization
benchmarks.
Differential-Revision: https://reviews.llvm.org/D32324
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303084 91177308-0d34-0410-b5e6-96231b3b80d8
Currently, when masked load, store, gather or scatter intrinsics are used, we check in CodeGenPrepare pass if the subtarget support these intrinsics, if not we replace them with scalar code - this is a functional transformation not an optimization (not optional).
CodeGenPrepare pass does not run when the optimization level is set to CodeGenOpt::None (-O0).
Functional transformation should run with all optimization levels, so here I created a new pass which runs on all optimization levels and does no more than this transformation.
Differential Revision: https://reviews.llvm.org/D32487
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303050 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We were asserting in RegisterBankInfo if RBI.copyCost() returns
UINT_MAX. This is OK for RegBankSelect::Mode::Fast since we only
try one instruction mapping and can't recover from this, but for
RegBankSelect::Mode::Greedy we will be considering multiple
instruction mappings, so we can recover if we see a UNIT_MAX copy
cost.
The copy cost for one pair of register banks in the AMDGPU backend
will be UNIT_MAX, so this patch will prevent AMDGPU tests from
breaking.
Reviewers: ab, qcolombet, t.p.northover, dsanders
Reviewed By: qcolombet
Subscribers: tpr, llvm-commits
Differential Revision: https://reviews.llvm.org/D33144
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303043 91177308-0d34-0410-b5e6-96231b3b80d8
- MIRYamlMapping: Default value provided for fields which have optional
mappings. Implemented == operators for required classes. When a field's value is
same as default value specified YAML IO class will not print it.
- MIRPrinter: Above mentioned behaviour is not on by default. If -simplify-mir
option not specified, then make yaml::Output to print fields with default values
too.
Differential Revision: https://reviews.llvm.org/D32304
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302984 91177308-0d34-0410-b5e6-96231b3b80d8
We end up dereferencing the end iterator here when the Aspect doesn't exist in the DefaultAction map.
Change the API to return Optional<LLT> and return None when not found.
Also update the callers to handle the None case
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302963 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: LiveRangeShrink pass moves instruction right after the definition with the same BB if the instruction and its operands all have more than one use. This pass is inexpensive and guarantees optimal live-range within BB.
Reviewers: davidxl, wmi, hfinkel, MatzeB, andreadb
Reviewed By: MatzeB, andreadb
Subscribers: hiraditya, jyknight, sanjoy, skatkov, gberry, jholewinski, qcolombet, javed.absar, krytarowski, atrick, spatel, RKSimon, andreadb, MatzeB, mehdi_amini, mgorny, efriedma, davide, dberlin, llvm-commits
Differential Revision: https://reviews.llvm.org/D32563
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302938 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Eli pointed out that it's unsafe to combine the shifts to ISD::SHL etc.,
because those are not defined for b > sizeof(a) * 8, even after some of
the combiners run.
However, PPCISD::SHL defines that behavior (as the instructions themselves).
Move the combination to the backend.
The tests in shift_mask.ll still pass.
Reviewers: echristo, hfinkel, efriedma, iteratee
Subscribers: nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D33076
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302937 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds min/max population count, leading/trailing zero/one bit counting methods.
The min methods return answers based on bits that are known without considering unknown bits. The max methods give answers taking into account the largest count that unknown bits could give.
Differential Revision: https://reviews.llvm.org/D32931
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302925 91177308-0d34-0410-b5e6-96231b3b80d8
CodeViewDebug sets Asm to nullptr to disable debug info generation. You
can get a .ll file like no-cus.ll from 'clang -gcodeview -g0', which
happens in the ubsan test suite.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302923 91177308-0d34-0410-b5e6-96231b3b80d8
Llvm-stress discovered that a COPY may end up in ExpandPostRA::LowerCopy()
with an undef source operand. It is not possible for the target to handle
this, as this flag is not passed to TII->copyPhysReg().
This patch solves this by treating such a COPY as an identity COPY.
Review: Matthias Braun
https://reviews.llvm.org/D32892
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302877 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Instead of using RemoveExtraEdges (which uses analyzeBranch, which cannot
always be trusted) at the end to fixup the CFG we keep the CFG updated as
we go along and remove or add branches and merge blocks.
This way we won't have any problems if the involved MBBs contain
unanalyzable instructions.
This fixes PR32721.
In that case we had a triangle
EBB
| \
| |
| TBB
| /
FBB
where FBB didn't have any successors at all since it ended with an
unconditional return. Then TBB and FBB were be merged into EBB, but EBB
would still keep its successors, and the use of analyzeBranch and
CorrectExtraCFGEdges wouldn't help to remove them since the return
instruction is not analyzable (at least not on ARM).
Reviewers: kparzysz, iteratee, MatzeB
Reviewed By: iteratee
Subscribers: aemerson, rengolin, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D33037
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302876 91177308-0d34-0410-b5e6-96231b3b80d8