In AVX 256bit vectors are valid vectors and therefore the Type Legalizer doesn't
split the VSELECT and SETCC nodes. AVX only supports MIN/MAX on 128bit vectors
and this fix enables vector splitting for this special case in the X86 DAG
Combiner.
This fix is related to PR16695, PR17002, and <rdar://problem/14594431>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191131 91177308-0d34-0410-b5e6-96231b3b80d8
The Type Legalizer recognizes that VSELECT needs to be split, because the type
is to wide for the given target. The same does not always apply to SETCC,
because less space is required to encode the result of a comparison. As a result
VSELECT is split and SETCC is unrolled into scalar comparisons.
This commit fixes the issue by checking for VSELECT-SETCC patterns in the DAG
Combiner. If a matching pattern is found, then the result mask of SETCC is
promoted to the expected vector mask for the given target. This mask has usually
te same size as the VSELECT return type (except for Intel KNL). Now the type
legalizer will split both VSELECT and SETCC.
This allows the following X86 DAG Combine code to sucessfully detect the MIN/MAX
pattern. This fixes PR16695, PR17002, and <rdar://problem/14594431>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191130 91177308-0d34-0410-b5e6-96231b3b80d8
The global registry is used to allow command line override of the
scheduler selection, but does not work well as the normal selection
API. For example, the same LLVM process should be able to target
multiple targets or subtargets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191071 91177308-0d34-0410-b5e6-96231b3b80d8
When selecting the DAG (add (WrapperRIP ...), (FrameIndex ...)), X86 code had
spotted the FrameIndex possibility and was working out whether it could fold
the WrapperRIP into this.
The test for forming a %rip version is notionally whether we already have a
base or index register (%rip precludes both), but we were forgetting to account
for the register that would be inserted later to access the frame.
rdar://problem/15024520
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190995 91177308-0d34-0410-b5e6-96231b3b80d8
1) make sure that the first two instructions of the sequence cannot
separate from each other. The linker requires that they be sequential.
If they get separated, it can still work but it will not work in all
cases because the first of the instructions mostly involves the hi part
of the pc relative offset and that part changes slowly. You would have
to be at the right boundary for this to matter.
2) make sure that this sequence begins on a longword boundary.
There appears to be a bug in binutils which makes some of these calculations
get messed up if the instruction sequence does not begin on a longword
boundary. This is being investigated with the appropriate binutils folks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190966 91177308-0d34-0410-b5e6-96231b3b80d8
XCore target: Add XCoreTargetTransformInfo
This is where getNumberOfRegisters() resides, which in turn returns the
number of vector registers (=0).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190936 91177308-0d34-0410-b5e6-96231b3b80d8
For some reason I never got around to adding these at the same time as
the signed versions. No idea why.
I'm not sure whether this SystemZII::BranchC* stuff is useful, or whether
it should just be replaced with an "is normal" flag. I'll leave that
for later though.
There are some boundary conditions that can be tweaked, such as preferring
unsigned comparisons for equality with [128, 256), and "<= 255" over "< 256",
but again I'll leave those for a separate patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190930 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We indicate that the object files are safe by emitting a @feat.00
absolute address symbol. The address is presumably interpreted as a
bitfield of features that the compiler would like to enable. Bit 0 is
documented in the PE COFF spec to opt in to "registered SEH", which is
what /safeseh enables.
LLVM's object files are safe by default because LLVM doesn't know how to
produce SEH handlers.
Reviewers: Bigcheese
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D1691
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190898 91177308-0d34-0410-b5e6-96231b3b80d8
Documenting a design choice to generate only medium model sequences for TLS
addresses at this time. Small and large code models could be supported if
necessary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190883 91177308-0d34-0410-b5e6-96231b3b80d8
Large code model on PPC64 requires creating and referencing TOC entries when
using the addis/ld form of addressing. This was not being done in all cases.
The changes in this patch to PPCAsmPrinter::EmitInstruction() fix this. Two
test cases are also modified to reflect this requirement.
Fast-isel was not creating correct code for loading floating-point constants
using large code model. This also requires the addis/ld form of addressing.
Previously we were using the addis/lfd shortcut which is only applicable to
medium code model. One test case is modified to reflect this requirement.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190882 91177308-0d34-0410-b5e6-96231b3b80d8
Add llvm.x86.* intrinsics for all of the Intel SHA Extensions instructions, as
well as tests. Also remove mayLoad and hasSideEffects, which can be inferred
from the instruction patterns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190864 91177308-0d34-0410-b5e6-96231b3b80d8
Fast-isel generates a COPY_TO_REGCLASS for widening f32 to f64, which
is a nop on PPC64. This is needed to keep the register class system
happy, but on the fast-isel path it is not removed before emit as it
is for DAG select. Ignore this op when emitting instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190795 91177308-0d34-0410-b5e6-96231b3b80d8
The port originally had special patterns for extload, mapping them to the
same instructions as sextload. It seemed neater to have patterns that
match "an extension that is allowed to be signed" and "an extension that
is allowed to be unsigned".
This was originally meant to be a clean-up, but it does improve the handling
of promoted integers a little, as shown by args-06.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190777 91177308-0d34-0410-b5e6-96231b3b80d8
This is a re-commit of r190764, with an extra check to make sure that we're not
performing the transformation on illegal types (a small test case has been
added for this as well).
Original commit message:
The PPC backend uses a target-specific DAG combine to turn unaligned Altivec
loads into a permutation-based sequence when possible. Unfortunately, the
target-specific DAG combine is not always called on all loads of interest
(sometimes the routines in DAGCombine call CombineTo such that the new node and
users are not added to the worklist); allowing the combine to trigger early
(before type legalization) mitigates this problem. Because the autovectorizers
only create legal vector types, I don't expect a lot of cases where this
optimization is enabled by type legalization in practice.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190771 91177308-0d34-0410-b5e6-96231b3b80d8
This is causing test-suite failures.
Original commit message:
The PPC backend uses a target-specific DAG combine to turn unaligned Altivec
loads into a permutation-based sequence when possible. Unfortunately, the
target-specific DAG combine is not always called on all loads of interest
(sometimes the routines in DAGCombine call CombineTo such that the new node and
users are not added to the worklist); allowing the combine to trigger early
(before type legalization) mitigates this problem. Because the autovectorizers
only create legal vector types, I don't expect a lot of cases where this
optimization is enabled by type legalization in practice.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190765 91177308-0d34-0410-b5e6-96231b3b80d8
The PPC backend uses a target-specific DAG combine to turn unaligned Altivec
loads into a permutation-based sequence when possible. Unfortunately, the
target-specific DAG combine is not always called on all loads of interest
(sometimes the routines in DAGCombine call CombineTo such that the new node and
users are not added to the worklist); allowing the combine to trigger early
(before type legalization) mitigates this problem. Because the autovectorizers
only create legal vector types, I don't expect a lot of cases where this
optimization is enabled by type legalization in practice.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190764 91177308-0d34-0410-b5e6-96231b3b80d8