------------------------------------------------------------------------
r318289 | jdevlieghere | 2017-11-15 02:57:05 -0800 (Wed, 15 Nov 2017) | 14 lines
[DebugInfo] Fix potential CU mismatch for SubprogramScopeDIEs.
In constructAbstractSubprogramScopeDIE there can be a potential mismatch
between `this` and the CU of ContextDIE when a scope is shared between
two DISubprograms belonging to a different CU. In that case, `this` is
the CU that was specified in the IR, but the CU of ContextDIE is that of
the first subprogram that was emitted. This patch fixes the mismatch by
looking up the CU of ContextDIE, and switching to use that.
This fixes PR35212 (https://bugs.llvm.org/show_bug.cgi?id=35212)
Patch by Philip Craig!
Differential revision: https://reviews.llvm.org/D39981
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@318542 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r311951 | adrian | 2017-08-28 16:07:43 -0700 (Mon, 28 Aug 2017) | 6 lines
Fix a logic error in DwarfExpression::addMachineReg()
This fixes PR34323 and thus splitting undescribable registers into
smaller, describable sub-registers.
https://bugs.llvm.org/show_bug.cgi?id=34323
------------------------------------------------------------------------
------------------------------------------------------------------------
r312038 | joerg | 2017-08-29 14:18:07 -0700 (Tue, 29 Aug 2017) | 2 lines
Simplify test case, so that it works for both trunk and release-5.0.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@314567 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r312348 | matze | 2017-09-01 11:36:26 -0700 (Fri, 01 Sep 2017) | 39 lines
LiveIntervalAnalysis: Fix alias regunit reserved definition
A register in CodeGen can be marked as reserved: In that case we
consider the register always live and do not use (or rather ignore)
kill/dead/undef operand flags.
LiveIntervalAnalysis however tracks liveness per register unit (not per
register). We already needed adjustments for this in r292871 to deal
with super/sub registers. However I did not look at aliased register
there. Looking at ARM:
FPSCR (regunits FPSCR, FPSCR~FPSCR_NZCV) aliases with FPSCR_NZCV
(regunits FPSCR_NZCV, FPSCR~FPSCR_NZCV) hence they share a register unit
(FPSCR~FPSCR_NZCV) that represents the aliased parts of the registers.
This shared register unit was previously considered non-reserved,
however given that we uses of the reserved FPSCR potentially violate
some rules (like uses without defs) we should make FPSCR~FPSCR_NZCV
reserved too and stop tracking liveness for it.
This patch:
- Defines a register unit as reserved when: At least for one root
register, the root register and all its super registers are reserved.
- Adjust LiveIntervals::computeRegUnitRange() for new reserved
definition.
- Add MachineRegisterInfo::isReservedRegUnit() to have a canonical way
of testing.
- Stop computing LiveRanges for reserved register units in HMEditor even
with UpdateFlags enabled.
- Skip verification of uses of reserved reg units in the machine
verifier (this usually didn't happen because there would be no cached
liverange but there is no guarantee for that and I would run into this
case before the HMEditor tweak, so may as well fix the verifier too).
Note that this should only affect ARMs FPSCR/FPSCR_NZCV registers today;
aliased registers are rarely used, the only other cases are hexagons
P0-P3/P3_0 and C8/USR pairs which are not mixing reserved/non-reserved
registers in an alias.
Differential Revision: https://reviews.llvm.org/D37356
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@314565 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r312022 | hans | 2017-08-29 11:41:00 -0700 (Tue, 29 Aug 2017) | 10 lines
[DAG] Bound loop dependence check in merge optimization.
The loop dependence check looks for dependencies between store merge
candidates not captured by the chain sub-DAG doing a check of
predecessors which may be very large. Conservatively bound number of
nodes checked for compilation time. (Resolves PR34326).
Landing on behalf of Nirav Dave to unblock the 5.0.0 release.
Differential Revision: https://reviews.llvm.org/D37220
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@312041 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r311623 | hans | 2017-08-23 18:08:27 -0700 (Wed, 23 Aug 2017) | 11 lines
[DAG] Fix Node Replacement in PromoteIntBinOp
When one operand is a user of another in a promoted binary operation
we may replace and delete the returned value before returning
triggering an assertion. Reorder node replacements to prevent this.
Fixes PR34137.
Landing on behalf of Nirav.
Differential Revision: https://reviews.llvm.org/D36581
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@311670 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r311429 | ctopper | 2017-08-21 22:40:17 -0700 (Mon, 21 Aug 2017) | 9 lines
[X86] Prevent several calls to ISD::isConstantSplatVector from returning a narrower APInt than the original scalar type
ISD::isConstantSplatVector can shrink to the smallest splat width. But we don't check the size of the resulting APInt at all. This can cause us to misinterpret the results.
This patch just adds a flag to prevent the APInt from changing width.
Fixes PR34271.
Differential Revision: https://reviews.llvm.org/D36996
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@311462 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r311071 | eladcohen | 2017-08-17 01:06:36 -0700 (Thu, 17 Aug 2017) | 13 lines
[SelectionDAG] Teach the vector-types operand scalarizer about SETCC
When v1i1 is legal (e.g. AVX512) the legalizer can reach
a case where a v1i1 SETCC with an illgeal vector type operand
wasn't scalarized (since v1i1 is legal) but its operands does
have to be scalarized. This used to assert because SETCC was
missing from the vector operand scalarizer.
This patch attemps to teach the legalizer to handle these cases
by scalazring the operands, converting the node into a scalar
SETCC node.
Differential revision: https://reviews.llvm.org/D36651
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@311409 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r310979 | qcolombet | 2017-08-15 17:17:05 -0700 (Tue, 15 Aug 2017) | 38 lines
[VirtRegRewriter] Properly model the register liveness on undef subreg definition
Undef subreg definition means that the content of the super register
doesn't matter at this point. While that's true for virtual registers,
this may not hold when replacing them with actual physical registers.
Indeed, some part of the physical register may be coalesced with the
related virtual register and thus, the values for those parts matter and
must be live.
The fix consists in checking whether or not subregs of the physical register
being assigned to an undef subreg definition are live through that def and
insert an implicit use if they are. Doing so, will keep them alive until
that point like they should be.
E.g., let vreg14 being assigned to R0_R1 then
%vreg14:gsub_0<def,read-undef> = COPY %R0 ; <-- R1 is still live here
%vreg14:gsub_1<def> = COPY %R1
Before this changes, the rewriter would change the code into:
%R0<def> = KILL %R0, %R0_R1<imp-def> ; <-- this tells R1 is redefined
%R1<def> = KILL %R1, %R0_R1<imp-def>, %R0_R1<imp-use> ; this value of this R1
; is believed to come
; from the previous
; instruction
Because of this invalid liveness, later pass could make wrong choices and in
particular clobber live register as it happened with the register scavenger in
llvm.org/PR34107
Now we would generate:
%R0<def> = KILL %R0, %R0_R1<imp-def>, %R0_R1<imp-use> ; This tells R1 needs to
; reach this point
%R1<def> = KILL %R1, %R0_R1<imp-def>, %R0_R1<imp-use>
The bug has been here forever, it got exposed recently because the register
scavenger got smarter.
Fixes llvm.org/PR34107
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@311108 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r310552 | eladcohen | 2017-08-10 00:44:23 -0700 (Thu, 10 Aug 2017) | 19 lines
[SelectionDAG] When scalarizing vselect, don't assert on
a legal cond operand.
When scalarizing the result of a vselect, the legalizer currently expects
to already have scalarized the operands. While this is true for the true/false
operands (which have the same type as the result), it is not case for the
condition operand. On X86 AVX512, v1i1 is legal - this leads to operations such
as '< N x type> vselect < N x i1> < N x type> < N x type>' where < N x type > is
illegal to hit an assertion during the scalarization.
The handling is similar to r205625.
This also exposes the fact that (v1i1 extract_subvector) should be legal
and selectable on AVX512 - We do this by custom lowering to vector_extract_elt.
This still leaves us in some cases with redundant dag nodes which will be
combined in a separate soon to come patch.
This fixes pr33349.
Differential revision: https://reviews.llvm.org/D36511
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@310635 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r309651 | inouehrs | 2017-07-31 20:32:15 -0700 (Mon, 31 Jul 2017) | 16 lines
[StackColoring] Update AliasAnalysis information in stack coloring pass
Stack coloring pass need to maintain AliasAnalysis information when merging stack slots of different types.
Actually, there is a FIXME comment in StackColoring.cpp
// FIXME: In order to enable the use of TBAA when using AA in CodeGen,
// we'll also need to update the TBAA nodes in MMOs with values
// derived from the merged allocas.
But, TBAA has been already enabled in CodeGen without fixing this pass.
The incorrect TBAA metadata results in recent failures in bootstrap test on ppc64le (PR33928) by allowing unsafe instruction scheduling.
Although we observed the problem on ppc64le, this is a platform neutral issue.
This patch makes the stack coloring pass maintains AliasAnalysis information when merging multiple stack slots.
------------------------------------------------------------------------
------------------------------------------------------------------------
r309849 | inouehrs | 2017-08-02 11:16:32 -0700 (Wed, 02 Aug 2017) | 19 lines
[StackColoring] Update AliasAnalysis information in stack coloring pass (part 2)
This patch is update after the first patch (https://reviews.llvm.org/rL309651) based on the post-commit comments.
Stack coloring pass need to maintain AliasAnalysis information when merging stack slots of different types.
Actually, there is a FIXME comment in StackColoring.cpp
// FIXME: In order to enable the use of TBAA when using AA in CodeGen,
// we'll also need to update the TBAA nodes in MMOs with values
// derived from the merged allocas.
But, TBAA has been already enabled in CodeGen without fixing this pass.
The incorrect TBAA metadata results in recent failures in bootstrap test on ppc64le (PR33928) by allowing unsafe instruction scheduling.
Although we observed the problem on ppc64le, this is a platform neutral issue.
This patch makes the stack coloring pass maintains AliasAnalysis information when merging multiple stack slots.
This patch fixes PR33928.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@309957 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r309930 | sdardis | 2017-08-03 02:38:46 -0700 (Thu, 03 Aug 2017) | 19 lines
[SelectionDAG] Resolve PR33978.
rL306209 taught SelectionDAG how to add the dereferenceable flag when
expanding memcpy and memmove. The fix however contained a nit where
the offset + size was constructed as an APInt of PointerSize rather
than PointerSizeInBits.
This lead to isDereferenceableAndAlignedPointer() get truncated values or
values which would be sign extended within that function leading to
incorrect results.
Thanks to Alex Crichton for reporting the issue!
This resolves PR33978.
Reviewers: inouehrs
Differential Revision: https://reviews.llvm.org/D36236
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@309956 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r309561 | sdardis | 2017-07-31 07:06:58 -0700 (Mon, 31 Jul 2017) | 14 lines
[SelectionDAG][mips] Fix PR33883
PR33883 shows that calls to intrinsic functions should not have their vector
arguments or returns subject to ABI changes required by the target.
This resolves PR33883.
Thanks to Alex Crichton for reporting the issue!
Reviewers: zoran.jovanovic, atanasyan
Differential Revision: https://reviews.llvm.org/D35765
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@309767 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r309422 | rnk | 2017-07-28 12:48:40 -0700 (Fri, 28 Jul 2017) | 25 lines
Fix conditional tail call branch folding when both edges are the same
The conditional tail call logic did the wrong thing when both
destinations of a conditional branch were the same:
BB#1: derived from LLVM BB %entry
Live Ins: %EFLAGS
Predecessors according to CFG: BB#0
JE_1 <BB#5>, %EFLAGS<imp-use,kill>
JMP_1 <BB#5>
BB#5: derived from LLVM BB %sw.epilog
Predecessors according to CFG: BB#1
TCRETURNdi64 <ga:@mergeable_conditional_tailcall>, 0, ...
We would fold the JE_1 to a TCRETURNdi64cc, and then remove our BB#5
successor. Then BB#5 would be deleted as it had no predecessors, leaving
a dangling "JMP_1 <BB#5>" reference behind to cause assertions later.
This patch checks that both conditional branch destinations are
different before doing the transform. The standard branch folding logic
is able to remove both the JMP_1 and the JE_1, and for my test case we
end up forming a better conditional tail call later.
Fixes PR33980
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@309574 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r309302 | rksimon | 2017-07-27 11:15:54 -0700 (Thu, 27 Jul 2017) | 3 lines
[SelectionDAG] Improve DAGTypeLegalizer::convertMask assertion (PR33960)
Improve DAGTypeLegalizer::convertMask's isSETCCorConvertedSETCC assertion to properly check for any mixture of SETCC or BUILD_VECTOR of constants, or a logical mask op of them.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@309348 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r308808 | arsenm | 2017-07-21 16:56:13 -0700 (Fri, 21 Jul 2017) | 6 lines
RA: Remove assert on empty live intervals
This is possible if there is an undef use when
splitting the vreg during spilling.
Fixes bug 33620.
------------------------------------------------------------------------
------------------------------------------------------------------------
r308813 | arsenm | 2017-07-21 17:24:01 -0700 (Fri, 21 Jul 2017) | 6 lines
RA: Remove another assert on empty intervals
This case is similar to the one fixed in r308808,
except when rematerializing.
Fixes bug 33884.
------------------------------------------------------------------------
------------------------------------------------------------------------
r308906 | arsenm | 2017-07-24 11:07:55 -0700 (Mon, 24 Jul 2017) | 6 lines
RA: Replace asserts related to empty live intervals
These don't exactly assert the same thing anymore, and
allow empty live intervals with non-empty uses.
Removed in r308808 and r308813.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@309171 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r308891 | d0k | 2017-07-24 09:18:09 -0700 (Mon, 24 Jul 2017) | 16 lines
[CodeGenPrepare] Cut off FindAllMemoryUses if there are too many uses.
This avoids excessive compile time. The case I'm looking at is
Function.cpp from an old version of LLVM that still had the giant memcmp
string matcher in it. Before r308322 this compiled in about 2 minutes,
after it, clang takes infinite* time to compile it. With this patch
we're at 5 min, which is still bad but this is a pathological case.
The cut off at 20 uses was chosen by looking at other cut-offs in LLVM
for user scanning. It's probably too high, but does the job and is very
unlikely to regress anything.
Fixes PR33900.
* I'm impatient and aborted after 15 minutes, on the bug report it was
killed after 2h.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_50@309131 91177308-0d34-0410-b5e6-96231b3b80d8
Allowing cycles in Phi traversal increases the scope of optimize memory instruction
in case we are in loop.
The added test shows an example of enabling optimization inside a loop.
Reviewers: loladiro, spatel, efriedma
Reviewed By: efriedma
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35294
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308419 91177308-0d34-0410-b5e6-96231b3b80d8
DIImportedEntity has a line number, but not a file field. To determine
the decl_line/decl_file we combine the line number from the
DIImportedEntity with the file from the DIImportedEntity's scope. This
does not work correctly when the parent scope is a DINamespace or a
DIModule, both of which do not have a source file.
This patch adds a file field to DIImportedEntity to unambiguously
identify the source location of the using/import declaration. Most
testcase updates are mechanical, the interesting one is the removal of
the FIXME in test/DebugInfo/Generic/namespace.ll.
This fixes PR33822. See https://bugs.llvm.org/show_bug.cgi?id=33822
for more context.
<rdar://problem/33357889>
https://bugs.llvm.org/show_bug.cgi?id=33822
Differential Revision: https://reviews.llvm.org/D35583
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308398 91177308-0d34-0410-b5e6-96231b3b80d8
Re-recommiting after landing DAG extension-crash fix.
Recommiting after adding check to avoid miscomputing alias information
on addresses of the same base but different subindices.
Memory accesses offset from frame indices may alias, e.g., we
may merge write from function arguments passed on the stack when they
are contiguous. As a result, when checking aliasing, we consider the
underlying frame index's offset from the stack pointer.
Static allocs are realized as stack objects in SelectionDAG, but its
offset is not set until post-DAG causing DAGCombiner's alias check to
consider access to static allocas to frequently alias. Modify isAlias
to consider access between static allocas and access from other frame
objects to be considered aliasing.
Many test changes are included here. Most are fixes for tests which
indirectly relied on our aliasing ability and needed to be modified to
preserve their original intent.
The remaining tests have minor improvements due to relaxed
ordering. The exception is CodeGen/X86/2011-10-19-widen_vselect.ll
which has a minor degradation dispite though the pre-legalized DAG is
improved.
Reviewers: rnk, mkuper, jonpa, hfinkel, uweigand
Reviewed By: rnk
Subscribers: sdardis, nemanjai, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D33345
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308350 91177308-0d34-0410-b5e6-96231b3b80d8
Reorder replacements to be user first in preparation for multi-level
folding to premptively avoid inadvertantly deleting later nodes from
sharing found from replacement.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308348 91177308-0d34-0410-b5e6-96231b3b80d8
When replacing a node and it's operand, replacing the operand node may
cause the deletion of the original node leading to an assertion
failure. Case around these replacements to avoid this without relying
on inspecting the DELETED_NODE opcode in various extend
dagcombiner cases.
Fixes PR32515.
Reviewers: dbabokin, RKSimon, davide, chandlerc
Subscribers: chandlerc, llvm-commits
Differential Revision: https://reviews.llvm.org/D34095
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308330 91177308-0d34-0410-b5e6-96231b3b80d8
Treat widening G_SREM and G_UREM the same as G_SDIV and G_UDIV. This is
going to be used in the ARM backend (and that's when the test will come
too).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308278 91177308-0d34-0410-b5e6-96231b3b80d8
optimizeMemoryInst contains a vector AddrModeInsts.
The only use of this vector is to check that all instructions are in the same
block as memory instruction. This check is guarded by PhiSeen flag,
so if we traversed through phi node then we do not need to keep information
in AddrModeInsts. AddModeInsts is set first time we found some addressing mode
and updated if we found new one later.
We can find next addressing mode only if we traverse phi node so all code
related to update of AddModeInsts can be safely removed.
Reviewers: loladiro, spatel, efriedma
Reviewed By: efriedma
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35291
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308265 91177308-0d34-0410-b5e6-96231b3b80d8
Now, getUserCost() only checks the src and dst types of EXT to decide it is free
or not. This change first checks the types, then calls isExtFreeImpl(), and
check if EXT can form ExtLoad at last. Currently, only AArch64 has customized
implementation of isExtFreeImpl() to check if EXT can be folded into its use.
Differential Revision: https://reviews.llvm.org/D34458
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308076 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
DominatorTreeBase used to have IsPostDominators (bool) member to indicate if the tree is a dominator or a postdominator tree. This made it possible to switch between the two 'modes' at runtime, but it isn't used in practice anywhere.
This patch makes IsPostDominator a template argument. This way, it is easier to switch between different algorithms at compile-time based on this argument and design external utilities around it. It also makes it impossible to incidentally assign a postdominator tree to a dominator tree (and vice versa), and to further simplify template code in GenericDominatorTreeConstruction.
Reviewers: dberlin, sanjoy, davide, grosser
Reviewed By: dberlin
Subscribers: mzolotukhin, llvm-commits
Differential Revision: https://reviews.llvm.org/D35315
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308040 91177308-0d34-0410-b5e6-96231b3b80d8
Recommiting after adding check to avoid miscomputing alias information
on addresses of the same base but different subindices.
Memory accesses offset from frame indices may alias, e.g., we
may merge write from function arguments passed on the stack when they
are contiguous. As a result, when checking aliasing, we consider the
underlying frame index's offset from the stack pointer.
Static allocs are realized as stack objects in SelectionDAG, but its
offset is not set until post-DAG causing DAGCombiner's alias check to
consider access to static allocas to frequently alias. Modify isAlias
to consider access between static allocas and access from other frame
objects to be considered aliasing.
Many test changes are included here. Most are fixes for tests which
indirectly relied on our aliasing ability and needed to be modified to
preserve their original intent.
The remaining tests have minor improvements due to relaxed
ordering. The exception is CodeGen/X86/2011-10-19-widen_vselect.ll
which has a minor degradation dispite though the pre-legalized DAG is
improved.
Reviewers: rnk, mkuper, jonpa, hfinkel, uweigand
Reviewed By: rnk
Subscribers: sdardis, nemanjai, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D33345
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@308025 91177308-0d34-0410-b5e6-96231b3b80d8
For multiprecision arithmetic on MIPS, rather than using ISD::ADDE / ISD::ADDC,
get SelectionDAG to break down the operation into ISD::ADDs and ISD::SETCCs.
For MIPS, only the DSP ASE has a carry flag, so in the general case it is not
useful to directly support ISD::{ADDE, ADDC, SUBE, SUBC} nodes.
Also improve the generation code in such cases for targets with
TargetLoweringBase::ZeroOrOneBooleanContent by directly using the result of the
comparison node rather than using it in selects. Similarly for ISD::SUBE /
ISD::SUBC.
Address optimization breakage by moving the generation of MIPS specific integer
multiply-accumulate nodes to before legalization.
This revolves PR32713 and PR33424.
Thanks to Simonas Kazlauskas and Pirama Arumuga Nainar for reporting the issue!
Reviewers: slthakur
Differential Revision: https://reviews.llvm.org/D33494
The previous version of this patch was too aggressive in producing fused
integer multiple-addition instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307906 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: Add target hooks for printing and parsing target MMO flags.
Targets may override getSerializableMachineMemOperandTargetFlags() to
return a mapping from string to flag value for target MMO values that
should be serialized/parsed in MIR output.
Add implementation of this hook for AArch64 SuppressPair MMO flag.
Reviewers: bogner, hfinkel, qcolombet, MatzeB
Subscribers: mcrosier, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D34962
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307877 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Some programs run into a stack overflow issue. This change avoids this
problem by replacing the recursive algorithm with the iterative version.
Reviewers: MatzeB, t.p.northover, dblaikie
Reviewed By: MatzeB
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35105
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307860 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: Continuing the work from https://reviews.llvm.org/D33240, this change introduces an element unordered-atomic memset intrinsic. This intrinsic is essentially memset with the implementation requirement that all stores used for the assignment are done with unordered-atomic stores of a given element size.
Reviewers: eli.friedman, reames, mkazantsev, skatkov
Reviewed By: reames
Subscribers: jfb, dschuff, sbc100, jgravelle-google, aheejin, efriedma, llvm-commits
Differential Revision: https://reviews.llvm.org/D34885
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307854 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: Continuing the work from https://reviews.llvm.org/D33240, this change introduces an element unordered-atomic memmove intrinsic. This intrinsic is essentially memmove with the implementation requirement that all loads/stores used for the copy are done with unordered-atomic loads/stores of a given element size.
Reviewers: eli.friedman, reames, mkazantsev, skatkov
Reviewed By: reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34884
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307796 91177308-0d34-0410-b5e6-96231b3b80d8
OpenCL 2.0 introduces the notion of memory scopes in atomic operations to
global and local memory. These scopes restrict how synchronization is
achieved, which can result in improved performance.
This change extends existing notion of synchronization scopes in LLVM to
support arbitrary scopes expressed as target-specific strings, in addition to
the already defined scopes (single thread, system).
The LLVM IR and MIR syntax for expressing synchronization scopes has changed
to use *syncscope("<scope>")*, where <scope> can be "singlethread" (this
replaces *singlethread* keyword), or a target-specific name. As before, if
the scope is not specified, it defaults to CrossThread/System scope.
Implementation details:
- Mapping from synchronization scope name/string to synchronization scope id
is stored in LLVM context;
- CrossThread/System and SingleThread scopes are pre-defined to efficiently
check for known scopes without comparing strings;
- Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in
the bitcode.
Differential Revision: https://reviews.llvm.org/D21723
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307722 91177308-0d34-0410-b5e6-96231b3b80d8
This is a second attempt to land this patch.
The first one resulted in a crash of clang sanitizer buildbot.
The fix is here and regression test is added.
This is a last fix for the corner case of PR32214. Actually this is not really corner case in general.
We should not do a loop rotation if we create an additional branch due to it.
Consider the case where we have a loop chain H, M, B, C , where
H is header with viable fallthrough from pre-header and exit from the loop
M - some middle block
B - backedge to Header but with exit from the loop also.
C - some cold block of the loop.
Let's H is determined as a best exit. If we do a loop rotation M, B, C, H we can introduce the extra branch.
Let's compute the change in number of branches:
+1 branch from pre-header to header
-1 branch from header to exit
+1 branch from header to middle block if there is such
-1 branch from cold bock to header if there is one
So if C is not a predecessor of H then we introduce extra branch.
This change actually prohibits rotation of the loop if both true
Best Exit has next element in chain as successor.
Last element in chain is not a predecessor of first element of chain.
Reviewers: iteratee, xur, sammccall, chandlerc
Reviewed By: iteratee
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34745
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307631 91177308-0d34-0410-b5e6-96231b3b80d8
CodeGenPrepare::optimizeMemoryInst contains a check that we do nothing
if all instructions combining the address for memory instruction is in the same
block as memory instruction itself.
However if any of these instruction are placed after memory instruction then
address calculation will not be folded to memory instruction.
The added test case shows an example.
Reviewers: loladiro, spatel, efriedma
Reviewed By: efriedma
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D34862
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307628 91177308-0d34-0410-b5e6-96231b3b80d8