29251 Commits

Author SHA1 Message Date
Rahman Lavaee
3a9a3691ee [Propeller]: Use a descriptive temporary symbol name for the end of the basic block.
This patch changes AsmPrinter to name the basic block end labels LBB_END${i}_${j}, where ${i} identifies the function and ${j} identifies the basic block. The new naming scheme is consistent with how basic block labels are named (.LBB${i}_${j}) and how function end symbols are named (.Lfunc_end${i}), and it helps to write stronger tests for the upcoming patch for the BB-Info section (as proposed in https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html). The end label is used with basicblock-labels (BB-Info section in future) and basicblock-sections to compute the size of basic blocks and basic block sections, respectively. For BB sections, the section containing the entry basic block will not have a BB end label since it already gets the function end-label.
This label is cached for every basic block (CachedEndMCSymbol) like the label for the basic block (CachedMCSymbol).
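
The naming scheme can be illustrated with a small Python sketch. The helper names are hypothetical, the label spellings are copied from the message above, and the addresses in the usage note are made up. It also shows the size computation that an end label enables: the end address minus the start address.

```python
def bb_label(func_id: int, bb_id: int) -> str:
    """Basic block label, spelled as in the commit message."""
    return f".LBB{func_id}_{bb_id}"


def bb_end_label(func_id: int, bb_id: int) -> str:
    """Basic block end label, spelled as in the commit message."""
    return f"LBB_END{func_id}_{bb_id}"


def func_end_label(func_id: int) -> str:
    """Function end symbol, spelled as in the commit message."""
    return f".Lfunc_end{func_id}"


def bb_size(symbols: dict, func_id: int, bb_id: int) -> int:
    """Size of a basic block: address of its end label minus its start label.

    `symbols` is a hypothetical name -> address map, not an LLVM data structure.
    """
    return symbols[bb_end_label(func_id, bb_id)] - symbols[bb_label(func_id, bb_id)]
```

For example, with `symbols = {".LBB0_1": 0x10, "LBB_END0_1": 0x1c}`, `bb_size(symbols, 0, 1)` evaluates to 12.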

Differential Revision: https://reviews.llvm.org/D83885
2020-08-05 13:17:19 -07:00
Denis Antrushin
f0a68d4ee0 [Statepoints] Operand folding in presence of tied registers.
Implement proper folding of statepoint meta operands (deopt and GC)
when statepoint uses tied registers.
For deopt operands it is just about properly preserving tiedness
in new instruction.
For tied GC operands folding is a little bit more tricky.
We can fold tied GC operands only from InlineSpiller, because it knows
how to properly reload tied def after it was turned into memory operand.
Other users (e.g. peephole) cannot properly fold such operands as they
do not know how (or when) to reload them from memory.
We do this by untying the operand we want to fold in InlineSpiller
and allowing only untied operands to be folded in foldPatchpoint.
2020-08-05 20:18:28 +07:00
Simon Pilgrim
850778022a [DAG] Fold vector (aext (load x)) -> (zext (truncate (zextload x)))
We currently don't do anything to fold any_extend vector loads as no target has such an instruction.

Instead, I've added support for folding to a zextload; SimplifyDemandedBits does a good job of adjusting the zext(truncate()) stages as required later on.

We still need the custom scalar extload handling instead of using the tryToFoldExtOfLoad helper as it has different legality tests - we can probably tweak that to reduce most of the code duplication.

Fixes the regression I mentioned in rG99a971cadff7

Differential Revision: https://reviews.llvm.org/D85129
2020-08-05 11:22:23 +01:00
Georgii Rymar
af54a4ae5b [llvm-readobj/elf] - Add testing for --stackmap and refine the implementation.
Currently, we only test the `--stackmap` option here:
https://github.com/llvm/llvm-project/blob/master/llvm/test/Object/stackmap-dump.test
It currently uses a precompiled MachO binary, and I've found no tests of this option for ELF.

The implementation also has issues. For example, it might assert on a wrong version
of the .llvm-stackmaps section. Or it might crash on an empty or truncated section.

This patch introduces a new tools/llvm-readobj/ELF test file as well as implements a few
basic checks to catch simple crashes/issues.

It also eliminates `unwrapOrError` calls in `printStackMap()`.

Differential revision: https://reviews.llvm.org/D85208
2020-08-05 13:09:04 +03:00
Matt Arsenault
53bc9040d0 GlobalISel: Use buildAnyExtOrTrunc
2020-08-04 22:04:04 -04:00
Matt Arsenault
90be7019b8 GlobalISel: Simplify code
This cannot be a vector of pointers, so using getScalarSizeInBits just
added a bit of extra noise.
2020-08-04 22:03:59 -04:00
Matt Arsenault
2fa463a27c GlobalISel: Fix redundant variable and shadowing
2020-08-04 22:03:55 -04:00
Matt Arsenault
dd7ad288a4 GlobalISel: Move load/store lowering to separate functions
2020-08-04 22:03:51 -04:00
Krzysztof Parzyszek
0ead9ed228 [RDF] Add operator<<(raw_ostream&, RegisterAggr), NFC
2020-08-04 18:40:07 -05:00
Krzysztof Parzyszek
242b3118d9 [RDF] Use hash-based containers, cache extra information
This improves performance.
2020-08-04 18:36:49 -05:00
Krzysztof Parzyszek
66ffc7ab04 [RDF] Really remove remaining uses of PhysicalRegisterInfo::normalize
2020-08-04 18:23:38 -05:00
Krzysztof Parzyszek
a88523630e [RDF] Cache register aliases in PhysicalRegisterInfo
This improves performance of PhysicalRegisterInfo::makeRegRef.
2020-08-04 18:10:00 -05:00
Krzysztof Parzyszek
6f283737e5 [RDF] Lower the sorting complexity in RDFLiveness::getAllReachingDefs
The sorting is needed, because reaching defs are (logically) ordered,
but are not collected in that order. This change will break up the
single call to std::sort into a series of smaller sorts, each of which
should use a cheaper comparison function than the original.
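
The idea of replacing one large sort with several smaller, cheaper ones can be sketched in Python. This is an illustration under stated assumptions, not the RDFLiveness code: each reaching def is modeled as a hypothetical `(block, position)` pair, and `block_rank` is a made-up map giving the logical order of blocks.

```python
from collections import defaultdict


def sort_all_at_once(defs, block_rank):
    """One sort over everything; every comparison consults the block order."""
    return sorted(defs, key=lambda d: (block_rank[d[0]], d[1]))


def sort_per_block(defs, block_rank):
    """Group by block first, then sort each small group by position only."""
    by_block = defaultdict(list)
    for block, position in defs:
        by_block[block].append(position)

    ordered = []
    # Sorting the (few) blocks is one small sort ...
    for block in sorted(by_block, key=lambda b: block_rank[b]):
        # ... and each per-block sort compares plain integers.
        for position in sorted(by_block[block]):
            ordered.append((block, position))
    return ordered
```

Both functions return the same order; the second one only ever compares block ranks among blocks and positions within one block.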
2020-08-04 18:06:37 -05:00
Eli Friedman
83274f02e1 [SelectionDAG][SVE] Support scalable vectors in getConstantFP()
Differential Revision: https://reviews.llvm.org/D85249
2020-08-04 15:32:43 -07:00
Krzysztof Parzyszek
3bf7627b97 [RDF] Remove uses of RDFRegisters::normalize (deprecate)
This function has been reduced to an identity function for some time.
2020-08-04 17:02:12 -05:00
Matt Arsenault
ba4d17c159 GlobalISel: Add utility for getting function argument live ins
Get the argument register and ensure there's a copy to the virtual
register. AMDGPU and AArch64 have similarish code to get the livein
value, and I also want to use this in multiple places.

This is a bit more aggressive about setting the register class than
the original function, but that's probably OK.

I think we're missing a few verifier checks for function live ins. I
noticed AArch64's calling convention code is not actually adding
liveins to functions, only the entry block (which apparently might not
matter that much?). There should probably be a verifier check that
entry block live ins are also live into the function. We also might
need a verifier check that the copy to the livein virtual register is
in the entry block.
2020-08-04 16:55:55 -04:00
Cameron McInally
640b249fd9 [FastISel] Don't transform FSUB(-0, X) -> FNEG(X) in FastISel
This corresponds with the SelectionDAGISel change in D84056.

Also, rename some poorly named tests in CodeGen/X86/fast-isel-fneg.ll with NFC.

Differential Revision: https://reviews.llvm.org/D85149
2020-08-04 14:42:53 -05:00
Matt Arsenault
a816bda54e GlobalISel: Handle llvm.localescape
This one is pretty easy and shrinks the list of unhandled
intrinsics. I'm not sure how relevant the insert point is. Using the
insert position of EntryBuilder will place this after
constants. SelectionDAG seems to end up emitting these after argument
copies and before anything else, but I don't think it really
matters. This also ends up emitting these in the opposite order from
SelectionDAG, but I don't think that matters either.

This also needs a fix to stop the later passes dropping this as a dead
instruction. DeadMachineInstructionElim's version of isDead special
cases LOCAL_ESCAPE for some reason, and I'm not sure why it's excluded
from MachineInstr::isLabel (or why isDead doesn't check it).

I also noticed DeadMachineInstructionElim never considers inline asm
as dead, but GlobalISel will drop asm with no constraints.
2020-08-04 15:19:02 -04:00
Cameron McInally
0f45fb4bbc [GlobalISel] Don't transform FSUB(-0, X) -> FNEG(X) in GlobalISel.
This patch stops unconditionally transforming FSUB(-0, X) into an FNEG(X) while building the MIR.

This corresponds with the SelectionDAGISel change in D84056.

Differential Revision: https://reviews.llvm.org/D85139
2020-08-04 11:27:09 -05:00
Jay Foad
bd8f1be276 [PowerPC] Custom lowering for funnel shifts
The custom lowering saves an instruction over the generic expansion, by
taking advantage of the fact that PowerPC shift instructions are well
defined in the shift-by-bitwidth case.
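
Why a well-defined shift-by-bitwidth helps can be sketched in Python. This is not the lowering from the patch; `ppc_like_shl` and `ppc_like_shr` are hypothetical helpers that model a shift returning 0 when the amount equals the bit width, which is the property the message refers to. With such shifts, a funnel shift needs no special case for a zero shift amount.

```python
def ppc_like_shl(value: int, amount: int, bits: int = 32) -> int:
    """Left shift; modeled assumption: amount == bits yields 0."""
    mask = (1 << bits) - 1
    return (value << amount) & mask if amount < bits else 0


def ppc_like_shr(value: int, amount: int, bits: int = 32) -> int:
    """Logical right shift; modeled assumption: amount == bits yields 0."""
    mask = (1 << bits) - 1
    return (value & mask) >> amount if amount < bits else 0


def fshl(x: int, y: int, z: int, bits: int = 32) -> int:
    """Funnel shift left: high `bits` of (x:y) shifted left by z mod bits."""
    z %= bits
    # When z == 0 the right shift is by `bits` and contributes 0.
    return ppc_like_shl(x, z, bits) | ppc_like_shr(y, bits - z, bits)


def fshr(x: int, y: int, z: int, bits: int = 32) -> int:
    """Funnel shift right: low `bits` of (x:y) shifted right by z mod bits."""
    z %= bits
    # When z == 0 the left shift is by `bits` and contributes 0.
    return ppc_like_shl(x, bits - z, bits) | ppc_like_shr(y, z, bits)
```

With shifts that are undefined at the bit width, the `z == 0` case would need extra masking or a select, which is where a generic expansion spends additional instructions.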

Differential Revision: https://reviews.llvm.org/D83948
2020-08-04 16:30:49 +01:00
Sander de Smalen
d910125a2c [AArch64][SVE] Fix CFA calculation in presence of SVE objects.
The CFA is calculated as (SP/FP + offset), but when there are
SVE objects on the stack the SP offset is partly scalable and
should instead be expressed as the DWARF expression:

     SP + offset + scalable_offset * VG

where VG is the Vector Granule register, containing the
number of 64-bit 'granules' in a scalable vector.
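
As a rough illustration of the expression above, the following Python sketch evaluates the CFA for a given vector length. The function names and the sample values are hypothetical; only the formula and the definition of VG come from the message.

```python
def vector_granules(vector_length_bits: int) -> int:
    """VG: the number of 64-bit granules in a scalable vector."""
    if vector_length_bits <= 0 or vector_length_bits % 64 != 0:
        raise ValueError("vector length must be a positive multiple of 64 bits")
    return vector_length_bits // 64


def cfa_address(sp: int, offset: int, scalable_offset: int, vg: int) -> int:
    """CFA = SP + offset + scalable_offset * VG."""
    return sp + offset + scalable_offset * vg
```

For a 256-bit vector length, `vector_granules(256)` is 4, so `cfa_address(0x1000, 16, 8, 4)` is `0x1000 + 16 + 32`. The same frame on a 512-bit implementation gives a different CFA, which is why a fixed offset cannot describe it.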

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D84043
2020-08-04 11:47:06 +01:00
Fangrui Song
6fcac4f990 [MC] Set sh_link to 0 if the associated symbol is undefined
Part of https://bugs.llvm.org/show_bug.cgi?id=41734

LTO can drop externally available definitions. Such AssociatedSymbol is
not associated with a symbol. ELFWriter::writeSection() will assert.

Allow a SHF_LINK_ORDER section to have sh_link=0.

We need to give sh_link a syntax, a literal zero in the linked-to symbol
position, e.g. `.section name,"ao",@progbits,0`

Reviewed By: pcc

Differential Revision: https://reviews.llvm.org/D72899
2020-08-03 13:43:48 -07:00
Jon Roelofs
040e0990b8 Fix typo: s/epomymous/eponymous/ NFC
2020-08-03 14:09:46 -06:00
Cameron McInally
60134c9850 [FPEnv] Don't transform FSUB(-0,X)->FNEG(X) in SelectionDAGBuilder.
This patch stops unconditionally transforming FSUB(-0,X) into an FNEG(X) while building the DAG. There is also one small change to handle the new FSUB(-0,X) similarly to FNEG(X) in the AMDGPU backend.

Differential Revision: https://reviews.llvm.org/D84056
2020-08-03 10:22:25 -05:00
Matt Arsenault
c177fc936f GlobalISel: Handle arbitrary FewerElementsVector for G_IMPLICIT_DEF
2020-08-03 09:14:08 -04:00
Matt Arsenault
7371d3a454 GlobalISel: Reimplement moreElementsVectorDst
Use pad with undef and unmerge with unused results. This is annoyingly
similar to several other places in LegalizerHelper, but they're all
slightly different.
2020-08-03 09:03:48 -04:00
Igor Kudrin
7228215a61 [DebugInfo] Make DIEDelta::SizeOf() more explicit. NFCI.
The patch restricts DIEDelta::SizeOf() to accept only DWARF forms that
are actually used in the LLVM codebase. This should make the use of the
class more explicit and help to avoid issues similar to those fixed in D83958
and D84094.

Differential Revision: https://reviews.llvm.org/D84095
2020-08-03 15:04:15 +07:00
Igor Kudrin
46e237eb24 [DebugInfo] Fix misleading use of DWARF forms with DIELabel. NFCI.
DIELabel can emit only 32- or 64-bit values, while it was created in
some places with DW_FORM_udata, which implies emitting uleb128.
Nevertheless, these places also expected to emit U32 or U64, but just
used a misleading DWARF form. The patch updates those places to use more
appropriate DWARF forms and restricts DIELabel::SizeOf() to accept only
forms that are actually used in the LLVM codebase.

Differential Revision: https://reviews.llvm.org/D84094
2020-08-03 15:04:08 +07:00
Igor Kudrin
a1f8fa9c5c [DebugInfo] Fix a comment and a variable name. NFC.
DebugLocListIndex keeps the index of an entry list, not the offset.

Differential Revision: https://reviews.llvm.org/D84093
2020-08-03 15:04:00 +07:00
Igor Kudrin
0a9d458c38 [DebugInfo] Make DIELocList::SizeOf() more explicit. NFCI.
DIELocList is used with a limited number of DWARF forms, see the only
place where it is instantiated, DwarfCompileUnit::addLocationList().

The patch marks the unexpected execution path in DIELocList::SizeOf()
as unreachable, to reduce ambiguity.

Differential Revision: https://reviews.llvm.org/D84092
2020-08-03 15:03:37 +07:00
Matt Arsenault
e7dc03b43a GlobalISel: Implement bitcast action for G_EXTRACT_VECTOR_ELEMENT
For AMDGPU, vectors with elements < 32 bits should be indexed in
32-bit elements and the desired bits extracted from there. For
elements > 64 bits, these should be reduced to 64/32-bit elements to enable
the normal dynamic indexing paths.
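
The small-element case can be sketched in Python: index the vector as 32-bit words, then shift and mask to get the desired bits. This is an illustration only; the function name is hypothetical, and the little-endian lane order within a word is an assumption, not something the message states.

```python
def extract_small_element(words, index: int, elt_bits: int) -> int:
    """Extract element `index` of width `elt_bits` from packed 32-bit words.

    Assumption: element 0 occupies the lowest bits of words[0].
    """
    if elt_bits <= 0 or elt_bits >= 32 or 32 % elt_bits != 0:
        raise ValueError("element width must be < 32 and divide 32")
    per_word = 32 // elt_bits
    word = words[index // per_word]           # dynamic index in 32-bit units
    shift = (index % per_word) * elt_bits     # bit position inside the word
    return (word >> shift) & ((1 << elt_bits) - 1)
```

For instance, with `words = [0x44332211, 0x88776655]`, 8-bit element 5 lives in the second word at bit offset 8 and evaluates to `0x66`.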

In the dynamic index cases, this produces shorter code most of the
time. This does immediately regress the constant index cases, but this
should be fixed once we have the most basic of shift combines.

The element size > 64 case is pretty much ported from the existing
DAG implementation for extract element promote. The increasing element
size case is new.
2020-08-02 10:42:07 -04:00
Simon Pilgrim
2fc1176955 [DAG] TargetLowering::expandMUL_LOHI - pass SDLoc as const&
Try to be more consistent with the SDLoc param in the TargetLowering methods.

This also exposes an issue where we were passing a SDNode as a SDLoc, relying on the implicit SDLoc(SDNode) constructor.
2020-08-02 15:31:36 +01:00
Simon Pilgrim
c5bd9ad18f [DAG] TargetLowering::LowerAsmOutputForConstraint - pass SDLoc as const&
Try to be more consistent with the SDLoc param in the TargetLowering methods.
2020-08-02 15:12:02 +01:00
Kazu Hirata
6e4cee6f1e Use llvm::is_contained where appropriate (NFC)

Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D85083
2020-08-01 21:51:06 -07:00
Evgeny Leviant
f0f5988a8b [MachineVerifier] Refactor calcRegsPassed. NFC
Patch improves performance of verify-machineinstrs pass up to 10x.
Differential revision: https://reviews.llvm.org/D84105
2020-08-01 12:58:52 +03:00
Sriraman Tallam
c09ff709a2 Rename basic block sections options to be consistent.
D68049 created options for basic block sections: -fbasic-block-sections=,
-funique-basic-block-section-names. Rename options in llc and lld (--lto-)
to be consistent. Specifically,

+ Rename basicblock-sections to basic-block-sections
+ Rename unique-bb-section-names to unique-basic-block-section-names

Differential Revision: https://reviews.llvm.org/D84462
2020-07-31 11:50:55 -07:00
Aditya Nandakumar
313711b2cf [GISel] Add combiners for G_INTTOPTR and G_PTRTOINT
https://reviews.llvm.org/D84909

Patch adds two new GICombinerRules, one for G_INTTOPTR and one for
G_PTRTOINT. The G_INTTOPTR elides ptr2int(int2ptr(x)) to a copy of x, if
the cast is within the same address space. The G_PTRTOINT elides
int2ptr(ptr2int(x)) to a copy of x. Patch additionally adds new combiner
tests for the AArch64 target to test these new combiner rules.
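
The effect of eliding a round trip through the two casts can be sketched on a toy expression tree in Python. All class and function names here are hypothetical and do not mirror the GlobalISel combiner API. The sketch assumes that the address-space condition applies to the pointer-to-int-to-pointer direction, since that is the only direction with two pointer types to compare, and it ignores integer widths.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Int:
    name: str


@dataclass(frozen=True)
class Ptr:
    name: str
    addrspace: int


@dataclass(frozen=True)
class IntToPtr:
    src: object
    addrspace: int  # address space of the resulting pointer


@dataclass(frozen=True)
class PtrToInt:
    src: object


def combine_cast(node):
    """Fold a cast of the opposite cast back to the original value."""
    # int -> ptr -> int: the result is the original integer.
    if isinstance(node, PtrToInt) and isinstance(node.src, IntToPtr):
        return node.src.src
    # ptr -> int -> ptr: only fold when the address space is unchanged.
    if isinstance(node, IntToPtr) and isinstance(node.src, PtrToInt):
        inner = node.src.src
        if getattr(inner, "addrspace", None) == node.addrspace:
            return inner
    return node
```

A cast pair that changes the address space is left untouched, because replacing it with a copy would change the pointer's type.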

Patch by mkitzan
2020-07-31 10:13:36 -07:00
Matt Arsenault
4eb4bb060f Support addrspacecast initializers with isNoopAddrSpaceCast
Moves isNoopAddrSpaceCast to the TargetMachine. It logically belongs
with the DataLayout.
2020-07-31 10:42:43 -04:00
Vitaly Buka
1bae08d2a5 [NFC] Remove unused GetUnderlyingObject parameter
Depends on D84617.

Differential Revision: https://reviews.llvm.org/D84621
2020-07-31 02:10:03 -07:00
Vitaly Buka
4ee4573a60 [NFC] GetUnderlyingObject -> getUnderlyingObject
I am going to touch them in the next patch anyway
2020-07-30 21:08:24 -07:00
Eli Friedman
244f68004d [LegalizeTypes][SVE] Support widen/split legalization for SPLAT_VECTOR
Just the obvious implementation that rewrites the result type. Also fix
warning from EXTRACT_SUBVECTOR legalization that triggers on the test.

Differential Revision: https://reviews.llvm.org/D84706
2020-07-30 16:17:45 -07:00
Jon Roelofs
50a1ea2ba8 [SelectionDAG] Fix lowering of vector geps
This fixes an assertion failure that was being triggered in
SelectionDAG::getZeroExtendInReg(), where it was trying to extend the <2xi32>
to i64 (which should have been <2xi64>).

Fixes: rdar://66016901

Differential Revision: https://reviews.llvm.org/D84884
2020-07-30 14:56:53 -06:00
Brendon Cahoon
4b5fd94277 Align store conditional address
In cases where the alignment of the datatype is smaller than
expected by the instruction, the address is aligned. The aligned
address is used for the load, but wasn't used for the store
conditional, which resulted in a run-time alignment exception.
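
The fix amounts to using the same aligned address for both halves of the load/store-conditional pair. A minimal Python sketch of the address computation follows; it assumes the address is aligned down by masking, which the message does not spell out, and the function name is hypothetical.

```python
def align_down(address: int, alignment: int) -> int:
    """Round `address` down to a multiple of `alignment` (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return address & ~(alignment - 1)


def locked_access_addresses(address: int, alignment: int):
    """Addresses for the locked load and the store conditional.

    Both must be the aligned address; using the original address for the
    store conditional is the bug described above.
    """
    aligned = align_down(address, alignment)
    return aligned, aligned
```

For example, `align_down(0x1003, 4)` is `0x1000`, and both accesses then use `0x1000`.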
2020-07-30 10:42:00 -05:00
jasonliu
741ec7aba1 [XCOFF][AIX] Enable -ffunction-sections
Summary:
This patch implements -ffunction-sections on AIX.
This patch focuses on assembly generation.
Follow-on patch needs to handle:
1. -ffunction-sections implication for jump table.
2. Object file generation path and associated testing.

Differential Revision: https://reviews.llvm.org/D83875
2020-07-30 13:30:01 +00:00
Sam Tebbs
6aafd76482 [DAGCombiner] Fold sext_inreg of a masked load into a sign extended masked load
This patch adds a DAG combine fold for a sext(masked_load) into a sign extended masked load.

Differential Revision: https://reviews.llvm.org/D84332
2020-07-30 10:34:02 +01:00
Kang Zhang
29be334f5a [PHIElimination] Fix the killed flag for LowerPHINode()
Summary:
In the phi-node-elimination pass, we set the killed flag incorrectly.
When we eliminate the PHI node, we replace the PHI with a copy for the
incoming value.

Before this patch, we set the killed flag on the incoming value in the
new copy (PHICopy) and removed the killed flag from the last use of the
incoming value (OldKill). This is correct only if the new PHICopy comes after the OldKill.

Reviewed By: bjope

Differential Revision: https://reviews.llvm.org/D80886
2020-07-30 08:18:50 +00:00
Matt Arsenault
c9d9ecfba5 GlobalISel: Use result of find rather than rechecking map
2020-07-29 21:26:20 -04:00
Matt Arsenault
71c6e8a505 GlobalISel: Handle assorted no-op intrinsics
SelectionDAGBuilder just drops these, so do the same.
2020-07-29 21:26:20 -04:00
Matt Arsenault
e2b102c48a GlobalISel: Handle llvm.roundeven
I still think it's highly questionable that we have two intrinsics
with identical behavior and only vary by the name of the libcall used
if it happens to be lowered that way, but try to reduce the feature
delta between SDAG and GlobalISel for recently added intrinsics. I'm
not sure which opcode should be considered the canonical one, but
lower roundeven back to round.
2020-07-29 20:01:12 -04:00
Philip Reames
f22040ec6c [Statepoint] Enable cross block relocates w/vreg lowering
This change is mechanical, it just removes the restriction and updates tests.  The key building blocks were submitted in 31342eb and 8fe2abc.

Note that this (and preceding changes) entirely subsumes D83965.  I did include a couple of its tests.

From the codegen changes, an interesting observation: this doesn't actually reduce spilling, it just lets the register allocator do its job.  That results in a slightly different overall result which has both pros and cons over the eager spill lowering.  (i.e. We'll have some perf tuning to do once this is stable.)
2020-07-29 13:32:51 -07:00