Ensure we keep track of the input vectors in all cases instead of just for SK_Select.
Ideally we'd reuse the shuffle mask pattern matching in TargetTransformInfo::getInstructionThroughput here to easily add support for all TargetTransformInfo::ShuffleKind without mass code duplication. I've added a TODO for now, but D48236 should help us here.
Differential Revision: https://reviews.llvm.org/D48023
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334958 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds instructions for comparing elements from two vectors, e.g.
cmpgt p0.s, p0/z, z0.s, z1.s
and also adds support for comparing to a 64-bit wide element vector, e.g.
cmpgt p0.s, p0/z, z0.s, z1.d
The patch also contains aliases for certain comparisons, e.g.:
cmple p0.s, p0/z, z0.s, z1.s => cmpge p0.s, p0/z, z1.s, z0.s
cmplo p0.s, p0/z, z0.s, z1.s => cmphi p0.s, p0/z, z1.s, z0.s
cmpls p0.s, p0/z, z0.s, z1.s => cmphs p0.s, p0/z, z1.s, z0.s
cmplt p0.s, p0/z, z0.s, z1.s => cmpgt p0.s, p0/z, z1.s, z0.s
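The alias scheme listed above amounts to renaming the mnemonic and swapping the two source vectors. A minimal C++ sketch of that mapping follows; CompareSketch and canonicalize are hypothetical names for illustration and are not part of the patch, which expresses the aliases in TableGen.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical, simplified view of a parsed SVE compare: the mnemonic plus
// the two source vector operands (destination and governing predicate are
// left out because the aliases do not change them).
struct CompareSketch {
  std::string Mnemonic;
  std::string Zn;
  std::string Zm;
};

// Rewrite the "less than" style aliases into the real instruction by
// renaming the mnemonic and swapping the source operands.
inline CompareSketch canonicalize(CompareSketch C) {
  static const std::pair<const char *, const char *> Aliases[] = {
      {"cmple", "cmpge"},
      {"cmplo", "cmphi"},
      {"cmpls", "cmphs"},
      {"cmplt", "cmpgt"},
  };
  for (const auto &A : Aliases) {
    if (C.Mnemonic == A.first) {
      C.Mnemonic = A.second;
      std::swap(C.Zn, C.Zm);
      break;
    }
  }
  return C; // Non-alias mnemonics pass through unchanged.
}
```

For example, canonicalize({"cmple", "z0.s", "z1.s"}) yields the mnemonic cmpge with operands z1.s, z0.s, matching the first alias in the list.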
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334931 91177308-0d34-0410-b5e6-96231b3b80d8
Previously we heap allocated the X86InstrFMA3Group objects which were created by passing them small register/memory opcode arrays that existed as individual static tables.
Rather than a bunch of small static arrays we now have one large static table of X86InstrFMA3Group objects. Rather than storing a pointer to the opcode arrays in the X86InstrFMA3Group object, we now store a register and memory array as part of the object. If a group doesn't have memory or register opcodes, the array entries will be 0.
This greatly simplifies the destruction of the X86InstrFMA3Info object. We no longer need to delete the X86InstrFMA3Group objects as we destruct the DenseMap. And we don't need to keep track of which ones we already deleted.
This reduces the llc binary size on my local machine by ~50k. I can only assume that's really due to the fact that we had something like 512 small static arrays that we passed to the init functions either one at a time or in pairs. So there were between 256 and 512 distinct calls to the init functions in the initOnceImpl method.
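The layout change can be sketched roughly as below. This is an illustration under assumptions, not the actual X86InstrFMA3Info code: FMA3GroupSketch, Groups, findGroup and all opcode values are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for X86InstrFMA3Group. The opcode arrays are stored
// inside the object, so no heap allocation or manual deletion is needed.
struct FMA3GroupSketch {
  // One slot per FMA form; 0 means "this group has no such opcode".
  uint16_t RegOpcodes[3];
  uint16_t MemOpcodes[3];
};

// One static table instead of many small per-group arrays. The opcode
// numbers are invented for illustration only.
static constexpr FMA3GroupSketch Groups[] = {
    {{100, 101, 102}, {200, 201, 202}}, // group with register and memory forms
    {{110, 111, 112}, {0, 0, 0}},       // register-only group
};

// Linear lookup of the group containing Opcode; returns nullptr if none.
inline const FMA3GroupSketch *findGroup(uint16_t Opcode) {
  if (Opcode == 0)
    return nullptr; // 0 is the "no opcode" marker, never a real key.
  for (const FMA3GroupSketch &G : Groups)
    for (std::size_t I = 0; I != 3; ++I)
      if (G.RegOpcodes[I] == Opcode || G.MemOpcodes[I] == Opcode)
        return &G;
  return nullptr;
}
```

Because the table has static storage and trivially destructible elements, any map built on top of it can hold plain pointers into the table and needs no cleanup pass.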
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334925 91177308-0d34-0410-b5e6-96231b3b80d8
We already have these aliases for EVEX encoded instructions, but not for the GPR, MMX, SSE, and VEX versions.
Also remove the vpextrw.s EVEX alias. That's not something gas implements.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334922 91177308-0d34-0410-b5e6-96231b3b80d8
The .s assembly strings allow the reversed forms to be targeted from assembly which matches gas behavior. But when printing the instructions we should print them without the .s to match other tooling like objdump. By using InstAliases we can use the normal string in the instruction and just hide it from the assembly parser.
Ideally we'd add the .s versions to the legacy SSE and VEX versions as well for full compatibility with gas. Not sure how we got to a state where only EVEX was supported.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334920 91177308-0d34-0410-b5e6-96231b3b80d8
This increases the size of the static tables, but is closer to what we would get if we used the autogenerated table directly. This reduces the remaining large deltas between what's in the manual table and what's in the autogenerated table.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334915 91177308-0d34-0410-b5e6-96231b3b80d8
symbols in debug mode.
The MaterializationResponsibility class hijacks the Materializing flag to track
symbols that have not yet been resolved in order to guard against redundant
resolution. Since this is an API contract check and only enforced in debug mode
there is no reason to maintain the flag state in release mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334909 91177308-0d34-0410-b5e6-96231b3b80d8
Some of the calls to hasSingleUseFromRoot were passing the load itself. If the load's chain result has a user, this would count against that. By getting the true parent of the match and ensuring any intermediate nodes between the match and the load have a single use, we can avoid this case. isLegalToFold will take care of checking users of the load's data output.
This allows at least fma-scalar-memfold.ll to succeed without the peephole pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334908 91177308-0d34-0410-b5e6-96231b3b80d8
Support for SVE's predicated select instructions to select elements
from either vector, both in a data-vector and a predicate-vector
variant.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334905 91177308-0d34-0410-b5e6-96231b3b80d8
We don't want to prevent inlining because of target-cpu and -features
attributes that were added to newer versions of LLVM/Clang: There are
no incompatible functions in PTX; ptxas will throw errors in such cases.
Differential Revision: https://reviews.llvm.org/D47691
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334904 91177308-0d34-0410-b5e6-96231b3b80d8
These all have a short form encoding that the assembler already prefers, though that preference seems to be based only on order in the .td file. Hiding the long form saves space in the table and prevents us from breaking the implicit order-based priority.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334897 91177308-0d34-0410-b5e6-96231b3b80d8
VMOVPQIto64Zmr is not a 64-bit mode only instruction. But I don't know how to test this because VMOVPQIto64mr should always have priority over it in 32-bit mode since its only advantage is XMM16-XMM31 which aren't usable in 32-bit mode.
VMOVPQIto64Zrr is a 64-bit mode only instruction, but we don't need to explicitly mark it as such because it uses a GR64 register which won't parse in 32-bit mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334896 91177308-0d34-0410-b5e6-96231b3b80d8
Relanding after fixing expensive check from modifying tables.
To avoid redundant work, during DAG legalization we keep tables
mapping pre-legalized SDValues to post-legalized SDValues and a
SDValue-to-SDValue map to enable fast node replacements. However, as
the keys are nodes which may be reused it is possible that an entry in
a table refers to a now deleted node N (that should have been renamed
by the value replacement map) while a new node N' exists. If N' is
then replaced, that entry would be wrong. Previously we avoided this by
walking every table and updating all node pointers whenever this
property might be violated. This is very expensive but hopefully a
rare occurrence.
This patch assigns each instance of a SDValue used in legalization a
unique id and uses these ids in the legalization tables. This avoids
any such aliasing issue, avoiding the full table search and allowing
more aggressive incremental table pruning.
In some cases this is a 1000x speedup to compilation.
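The idea of keying the tables on ids rather than node pointers can be sketched as follows. This is a simplified illustration, not the DAG legalizer's code: LegalizeTablesSketch, NodePtr, TableId and every method name are hypothetical, and a raw address stands in for an SDValue.

```cpp
#include <cassert>
#include <unordered_map>

using NodePtr = const void *; // hypothetical stand-in for an SDValue
using TableId = unsigned;     // 0 is reserved to mean "no entry"

class LegalizeTablesSketch {
  TableId NextId = 1;
  std::unordered_map<NodePtr, TableId> ValueToId; // live value -> its id
  std::unordered_map<TableId, TableId> Promoted;  // old id -> legalized id

public:
  // Return the id of a live value, assigning a fresh one on first use.
  TableId getId(NodePtr V) {
    auto It = ValueToId.find(V);
    if (It != ValueToId.end())
      return It->second;
    TableId Id = NextId++;
    ValueToId.emplace(V, Id);
    return Id;
  }

  // Called when a node is deleted. Only the address mapping is dropped, so
  // a new node reusing the address gets a new id and cannot alias old
  // table entries. No table walk is required.
  void forget(NodePtr V) { ValueToId.erase(V); }

  void recordPromoted(NodePtr Old, NodePtr New) {
    TableId OldId = getId(Old);
    TableId NewId = getId(New);
    Promoted[OldId] = NewId;
  }

  // Returns 0 if the id has no promoted counterpart.
  TableId lookupPromoted(TableId Id) const {
    auto It = Promoted.find(Id);
    return It == Promoted.end() ? 0 : It->second;
  }
};
```

With pointer keys, a reused address would silently pick up the stale entry; with ids, the stale entry simply stays attached to an id nobody asks about and can be pruned incrementally.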
Reviewers: jyknight, echristo, bogner, tra
Reviewed By: bogner
Subscribers: dberris, grandinj, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D47959
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334880 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: This patch originated from D47388 and is a proper subset of the originating changes, containing only the fmf optimization guard extensions.
Reviewers: spatel, hfinkel, wristow, arsenm, javed.absar, rampitec, nhaehnle, nemanjai
Reviewed By: rampitec, nhaehnle
Subscribers: tpr, nemanjai, wdng
Differential Revision: https://reviews.llvm.org/D47918
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334876 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Obviates the need for mask/clear/setFlags helpers.
There are some expressions here which can be simplified, but to keep
this easy to review, I have not simplified them in this patch.
No functional change.
Reviewers: sanjoy
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D48237
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334874 91177308-0d34-0410-b5e6-96231b3b80d8
So far, we've only handled special cases of PatFrag like ImmLeaf. This patch
adds support for the remaining cases using similar mechanisms.
Like most C++ code from SelectionDAG, GISel and DAGISel expect to operate on
different types and representations and as such the code is not compatible
between the two. It's therefore necessary to add an alternative implementation
in the GISelPredicateCode field.
The target test for this feature could easily be done with IntImmLeaf and this
would save on a little boilerplate. The reason I've chosen to implement this
using PatFrag.GISelPredicateCode and not IntImmLeaf is because I was unable to
find a rule that was blocked solely by lack of support for PatFrag predicates. I
found that the ones I investigated as being likely candidates for the test
were further blocked by other things.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334871 91177308-0d34-0410-b5e6-96231b3b80d8
Not sure any of these matter today because I don't think we ever produce them with IMPLICIT_DEF as an input. But by listing them we won't be surprised in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334867 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: This patch originated from D46562 and is a proper subset, with some issues addressed.
Reviewers: spatel, hfinkel, wristow, arsenm
Reviewed By: spatel
Subscribers: wdng, nhaehnle
Differential Revision: https://reviews.llvm.org/D47954
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334862 91177308-0d34-0410-b5e6-96231b3b80d8
Enables using the high and high-adjusted symbol modifiers on thread local
storage modifiers in powerpc assembly. Needed to be able to support 64 bit
thread-pointer and dynamic-thread-pointer access sequences.
Differential Revision: https://reviews.llvm.org/D47754
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334856 91177308-0d34-0410-b5e6-96231b3b80d8
Add support for the "@high" and "@higha" symbol modifiers in powerpc64 assembly.
The modifiers represent accessing the segment consisting of bits 16-31 of a
64-bit address/offset.
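As a rough illustration of what the two modifiers select, the sketch below computes the values arithmetically. The function names are hypothetical, and the +0x8000 adjustment for "@higha" is taken from the usual PowerPC "adjusted" convention (compensating for the sign extension of the low 16 bits) rather than from this patch.

```cpp
#include <cassert>
#include <cstdint>

// Bits 16-31 of a 64-bit value, as selected by "@high".
constexpr uint16_t high(uint64_t X) {
  return static_cast<uint16_t>((X >> 16) & 0xffff);
}

// Adjusted form, as selected by "@higha": add 0x8000 first so that a later
// sign-extended add of the low 16 bits lands on the original value.
constexpr uint16_t higha(uint64_t X) {
  return static_cast<uint16_t>(((X + 0x8000) >> 16) & 0xffff);
}

// Recombine higha with the sign-extended low half. Only meant for values
// that fit in 32 bits; illustration only.
inline int64_t rebuildLow32(uint64_t X) {
  int64_t Lo = static_cast<int64_t>(X & 0xffff);
  if (Lo >= 0x8000)
    Lo -= 0x10000; // sign-extend the 16-bit low half
  return (static_cast<int64_t>(higha(X)) << 16) + Lo;
}
```

For 0x12348000, high gives 0x1234 while higha gives 0x1235, because the low half 0x8000 is negative once sign-extended.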
Differential Revision: https://reviews.llvm.org/D47729
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334855 91177308-0d34-0410-b5e6-96231b3b80d8
An earlier commit prevented folds from the peephole pass by checking for IMPLICIT_DEF. But later in the pipeline IMPLICIT_DEF just becomes an Undef flag on the input register, so we need to check for that case too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334848 91177308-0d34-0410-b5e6-96231b3b80d8
When coalescing a small register into a subregister of a larger register,
if the larger register is rematerialized, the function updateRegDefUses
can add an <undef> flag to the rematerialized definition (since it's
treating it as only defining the coalesced subregister). While with that
assumption doing so is not incorrect, make sure to remove the flag later
on after the call to updateRegDefUses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334845 91177308-0d34-0410-b5e6-96231b3b80d8