1130 Commits

Author SHA1 Message Date
Craig Topper
ee18eb90ff [AVX-512] Fix accidental uses of AH/BH/CH/DH after copies to/from mask registers
We've had several bugs (PR32256, PR32241) recently that resulted from uses of AH/BH/CH/DH either before or after a copy to/from a mask register.

This ultimately occurs because we create COPY_TO_REGCLASS with VK1 and GR8. Then in CopyToFromAsymmetricReg in X86InstrInfo we find a 32-bit super register for the GR8 to emit the KMOV with. But as these tests demonstrate, it's possible for the GR8 register to be a high register, and we end up doing an accidental extract or insert from bits 15:8.

I think the best way forward is to stop making copies directly between mask registers and GR8/GR16. Instead I think we should restrict to only copies between mask registers and GR32/GR64 and use EXTRACT_SUBREG/INSERT_SUBREG to handle the conversion from GR32 to GR16/8 or vice versa.
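
To illustrate the failure mode described above, here is a minimal standalone sketch. It is not LLVM code: ByteReg, readByte, maskFromSuperNaive and maskFromExplicitExtract are hypothetical names invented for this illustration. It models a 32-bit register whose value of interest lives in the high byte (as with AH), and compares "take the low bits of the super register" with an explicit sub-register extract.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the two 8-bit views of a 32-bit register such as EAX.
// Low corresponds to AL (bits 7:0), High corresponds to AH (bits 15:8).
enum class ByteReg { Low, High };

// Read an 8-bit sub-register out of its 32-bit super register.
constexpr uint8_t readByte(uint32_t Super, ByteReg R) {
  return R == ByteReg::Low ? static_cast<uint8_t>(Super)
                           : static_cast<uint8_t>(Super >> 8);
}

// Problematic pattern: move the whole super register and keep its low bits,
// ignoring which byte actually holds the value.
constexpr uint8_t maskFromSuperNaive(uint32_t Super, ByteReg /*unused*/) {
  return static_cast<uint8_t>(Super);
}

// Pattern in the spirit of the proposal: extract the sub-register explicitly,
// then form the mask from the extracted value.
constexpr uint8_t maskFromExplicitExtract(uint32_t Super, ByteReg R) {
  return readByte(Super, R);
}

// With the value in the high byte, the naive version reads the wrong bits.
static_assert(maskFromSuperNaive(0x0000A501u, ByteReg::High) == 0x01,
              "naive path returns bits 7:0");
static_assert(maskFromExplicitExtract(0x0000A501u, ByteReg::High) == 0xA5,
              "explicit extract returns bits 15:8");
```

The two paths agree when the value sits in the low byte, which is presumably why the problem only shows up when a high register is allocated.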

Unfortunately, this complicates fast isel a bit more, since it now has to create the subreg extracts where we used to create GR8 copies. We can probably make a helper function to cut down on the repetition.

This does result in KMOVD being used for copies when BWI is available because we don't know the original mask register size. This caused a lot of deltas on tests because we have to split the checks for KMOVD vs KMOVW based on BWI.

Differential Revision: https://reviews.llvm.org/D30968



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298928 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-28 16:35:29 +00:00
Matthias Braun
76900ddfb6 ExecutionDepsFix: Normalize names; NFC
Normalize ExeDepsFix, execution-fix, ExecutionDependencyFix and
ExecutionDepsFix to the last one.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298183 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-18 05:05:40 +00:00
Matthias Braun
e00719deb7 TargetInstrInfo: Provide default implementation of isTailCall().
In fact this default implementation should be the only implementation;
keep it virtual for now to accommodate targets that don't model flags
correctly.

Differential Revision: https://reviews.llvm.org/D30747

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297980 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-16 20:02:30 +00:00
Jessica Paquette
af024ba867 [Outliner] Add tail call support
This commit adds tail call support to the MachineOutliner pass. This allows
the outliner to insert jumps rather than calls in areas where tail calling is
possible. Outlined tail calls include the return or terminator of the basic
block being outlined from.

Tail call support allows the outliner to take returns and terminators into
consideration while finding candidates to outline. It also allows the outliner
to save more instructions. For example, in the X86-64 outliner, a tail called
outlined function saves one instruction since no return has to be inserted.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297653 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-13 18:39:33 +00:00
Craig Topper
2e124a6c7c [AVX-512] Fix a bad use of a high GR8 register after copying from a mask register during fast isel. This ends up extracting from bits 15:8 instead of the lower bits of the mask.
I'm pretty sure there are more problems lurking here. But I think this fixes PR32241.

I've added the test case from that bug and added asserts that will fail if we ever try to copy between high registers and mask registers again.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297574 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-12 03:37:37 +00:00
Jessica Paquette
d43adee378 [Outliner] Fixed Asan bot failure in r296418
Fixed the asan bot failure which led to the last commit of the outliner being reverted.
The change is in lib/CodeGen/MachineOutliner.cpp in the SuffixTree's constructor. LeafVector
is no longer initialized using reserve but just a standard constructor.
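
The commit message does not show the code involved, so the following is only a generic sketch of the difference between the two initialization styles it mentions. The function names are made up for this illustration, and the element type (int) is an assumption. reserve() only sets capacity and leaves the vector empty, whereas the sized constructor creates real elements that can be indexed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// reserve() allocates storage but creates no elements: size() stays 0,
// so indexing into the vector afterwards would be out of bounds.
inline std::size_t sizeAfterReserve(std::size_t N) {
  std::vector<int> V;
  V.reserve(N);
  assert(V.capacity() >= N);
  return V.size();
}

// The sized constructor value-initializes N elements: size() is N and
// every index in [0, N) is valid.
inline std::size_t sizeAfterSizedConstruction(std::size_t N) {
  std::vector<int> V(N);
  return V.size();
}
```

Calling sizeAfterReserve(8) yields 0 while sizeAfterSizedConstruction(8) yields 8. Out-of-bounds accesses of the first kind are the sort of thing AddressSanitizer builds tend to flag, though whether that was the exact failure here is not stated in the message.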



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297081 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-06 21:31:18 +00:00
Matthias Braun
73ddbb7dff Revert "Add MIR-level outlining pass"
Revert Machine Outliner for now, as it breaks the asan bot.

This reverts commit r296418.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296426 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-28 02:24:30 +00:00
Matthias Braun
c043a889f1 Add MIR-level outlining pass
This is a patch for the outliner described in the RFC at:
http://lists.llvm.org/pipermail/llvm-dev/2016-August/104170.html

The outliner is a code-size reduction pass which works by finding
repeated sequences of instructions in a program, and replacing them with
calls to functions. This is useful to people working in low-memory
environments, where sacrificing performance for space is acceptable.
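
As a rough illustration of "finding repeated sequences", the sketch below does a naive cubic-time search over a vector of integers standing in for instructions. This is not how the pass works: elsewhere in this log the implementation is described as having a SuffixTree, and Repeat and findLongestRepeat are hypothetical names used only here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Positions and length of a sequence that occurs twice.
struct Repeat {
  std::size_t First;
  std::size_t Second;
  std::size_t Length;
};

// Naive search for the longest sequence that occurs twice without
// overlapping. Each int stands in for one (hashed) instruction.
inline Repeat findLongestRepeat(const std::vector<int> &Ops) {
  Repeat Best{0, 0, 0};
  const std::size_t N = Ops.size();
  for (std::size_t I = 0; I < N; ++I) {
    for (std::size_t J = I + 1; J < N; ++J) {
      std::size_t Len = 0;
      // Extend while elements match and the first occurrence does not
      // run into the start of the second one.
      while (J + Len < N && I + Len < J && Ops[I + Len] == Ops[J + Len])
        ++Len;
      if (Len > Best.Length)
        Best = {I, J, Len};
    }
  }
  return Best;
}
```

For the input {1, 2, 3, 9, 1, 2, 3, 7} this reports a repeat of length 3 at positions 0 and 4; an outliner would then replace both occurrences with a call to a new function containing that sequence.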

This adds an interprocedural outliner directly before printing assembly.
For reference on how this would work, this patch also includes X86
target hooks and an X86 test.

The outliner is run like so:

clang -mno-red-zone -mllvm -enable-machine-outliner file.c

Patch by Jessica Paquette <jpaquette@apple.com>!

rdar://29166825

Differential Revision: https://reviews.llvm.org/D26872

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296418 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-28 00:33:32 +00:00
Ayman Musa
6f30b9797e [X86][AVX] Disable VCVTSS2SD & VCVTSD2SS memory folding and fix the register class of their first input when creating node in fast-isel.
(Quick fix to buildbot failure after rL295940 commit).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295970 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-23 13:15:44 +00:00
Ayman Musa
ff35eecd7d [X86][AVX512] Remove VCVTSS2SDZ & VCVTSD2SSZ from memory folding tables as they introduce new read dependency when folding.
(Quick fix to buildbot failure).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295946 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-23 08:13:36 +00:00
Ayman Musa
70ad23eba8 [X86][AVX512] Change VCVTSS2SD and VCVTSD2SS node types to keep consistency between VEX/EVEX versions.
AVX versions of the converts work on f32/f64 types, while AVX512 versions work on vectors.

Differential Revision: https://reviews.llvm.org/D29988



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295940 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-23 07:24:21 +00:00
Craig Topper
9ca276657a [AVX-512] Add broadcast VPTERNLOG instructions to special case commuting switch.
The instructions are marked commutable, but without special handling we don't get the immediate correct.
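
To show why the immediate needs adjusting, here is a small standalone sketch, not taken from the LLVM sources. It assumes the usual VPTERNLOG encoding in which the result for input bits (a, b, c) is bit ((a << 2) | (b << 1) | c) of the 8-bit immediate. The helper names are hypothetical, and only one possible commutation (swapping the last two operands) is shown; the message does not say which operand pairs the switch handles.

```cpp
#include <cassert>
#include <cstdint>

// Assumed encoding: the result for inputs (A, B, C) is the bit of Imm at
// index (A << 2) | (B << 1) | C.
constexpr bool ternlogBit(uint8_t Imm, bool A, bool B, bool C) {
  return ((Imm >> ((A << 2) | (B << 1) | C)) & 1) != 0;
}

// Immediate to use after swapping the last two operands. Swapping B and C
// exchanges truth-table rows 1 <-> 2 and 5 <-> 6; rows 0, 3, 4 and 7 have
// B == C and stay in place.
constexpr uint8_t commuteLastTwo(uint8_t Imm) {
  return static_cast<uint8_t>((Imm & 0x99) | ((Imm & 0x22) << 1) |
                              ((Imm & 0x44) >> 1));
}

// 0xCA encodes "A ? B : C"; after the swap it must become "A ? C : B".
static_assert(commuteLastTwo(0xCA) == 0xAC, "select operands swapped");
```

Without this rewrite, commuting the operands while leaving the immediate untouched would compute a different boolean function for any immediate that is not symmetric in the swapped inputs.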

While here also remove the masked memory forms that aren't commutable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295602 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-19 08:03:26 +00:00
Craig Topper
4deede4ddb [X86][XOP] Reduce the size of a multiclass by moving more stuff to parameters instead of doing 128-bit and 256-bit simultaneously.
This requires some instructions to be renamed to move the Y earlier in the instruction name. The new names are more consistent with other instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295579 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-18 22:53:43 +00:00
Craig Topper
58ee25f913 Recommit "[X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR."
Clang has now been fixed to not use these intrinsics.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295571 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-18 21:50:58 +00:00
Craig Topper
b3d03a9308 Revert "[X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR."
This reverts r295564. I missed that clang was still using the intrinsics despite our half-implemented autoupgrade support.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295565 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-18 20:14:20 +00:00
Craig Topper
aa2f6f93a2 [X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR.
It seems we were already upgrading 128-bit VPCMOV, but the intrinsic was still defined and being used in isel patterns. While I was here I also simplified the tablegen multiclasses.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295564 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-18 19:51:25 +00:00
Hans Wennborg
a8edb5cd90 Re-apply r282920 "X86: Allow conditional tail calls in Win64 "leaf" functions (PR26302)"
The original commit was reverted in r283329 due to a miscompile in
Chromium. That turned out to be the same issue as PR31257, which was
fixed in r295262.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295357 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-16 19:04:42 +00:00
Hans Wennborg
b6ae6ad928 [X86] Re-enable conditional tail calls and fix PR31257.
This reverts r294348, which removed support for conditional tail calls
due to the PR above. It fixes the PR by marking live registers as
implicitly used and defined by the now predicated tailcall. This is
similar to how IfConversion predicates instructions.

Differential Revision: https://reviews.llvm.org/D29856

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295262 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-16 00:04:05 +00:00
Craig Topper
fc3e843620 [AVX-512] Add PACKSS/PACKUS instructions to load folding tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295154 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-15 06:51:39 +00:00
Craig Topper
6383c00b0a [AVX-512] Add PAVGB/PAVGW to load folding tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295035 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-14 06:54:57 +00:00
Craig Topper
6954de742c [AVX-512] Add various EVEX move instructions to load folding tables using the VEX equivalents as a guide.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294908 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-12 18:47:46 +00:00
Craig Topper
82c3f60cd8 [AVX-512] Add VPEXTRD/Q to load folding tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294905 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-12 18:47:37 +00:00
Craig Topper
5bb68b46d1 [AVX-512] Add VPMINS/MINU/MAXS/MAXU instructions to load folding tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294858 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-11 17:35:28 +00:00
Craig Topper
b3ac0dcae6 [X86] Improve alphabetizing of load folding tables. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294857 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-11 17:35:25 +00:00
Simon Pilgrim
bff8d8792a [X86][3DNow!] Enable PFSUB<->PFSUBR commutation
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294847 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-11 13:51:14 +00:00
Craig Topper
02524a88e4 [AVX-512] Add VPINSRB/W/D/Q instructions to load folding tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294830 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-11 07:01:40 +00:00
Craig Topper
7334434419 [AVX-512] Add VPSADBW instructions to load folding tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294827 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-11 06:24:03 +00:00
Craig Topper
ff0f1ca865 [X86] Don't base domain decisions on VEXTRACTF128/VINSERTF128 if only AVX1 is available.
Seems the execution dependency pass likes to use FP instructions when most of the consuming code is integer if a vextractf128 instruction produced the register. Without AVX2 we don't have the corresponding integer instruction available.

This patch suppresses the domain on these instructions to GenericDomain if AVX2 is not supported so that they are ignored by domain fixing. If AVX2 is supported we'll report the correct domain and allow them to switch between integer and fp.

Overall I think this produces better results in the modified test cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294824 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-11 05:32:57 +00:00
Peter Collingbourne
ce00de82c2 X86: Teach X86InstrInfo::analyzeCompare to recognize compares of symbols.
This requires that we communicate to X86InstrInfo::optimizeCompareInstr
that the second operand is neither a register nor an immediate. The way we
do that is by setting CmpMask to zero.

Note that there were already instructions where the second operand was neither a
register nor an immediate, namely X86::SUB*rm, so also set CmpMask to zero
for those instructions. This seems like a latent bug, but I was unable to
trigger it.

Differential Revision: https://reviews.llvm.org/D28621

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294634 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-09 21:58:24 +00:00
Hans Wennborg
34a6e0d36a [X86] Disable conditional tail calls (PR31257)
They are currently modelled incorrectly (as calls, which clobber
registers, confusing e.g. Machine Copy Propagation).

Reverting until we figure out the proper solution.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294348 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-07 20:37:45 +00:00
Craig Topper
472303af3a [AVX-512] Add masked and unmasked shift by immediate instructions to load folding tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294287 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-07 07:31:00 +00:00
Craig Topper
78b86169a5 [AVX-512] Add masked shift instructions to load folding tables.
This adds the masked versions of everything but the shift-by-immediate instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294286 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-07 07:30:57 +00:00
Craig Topper
6e1fdd37ce [AVX-512] Add some of the shift instructions to the load folding tables.
This includes unmasked forms of variable shift and shifting by the lower element of a register.

Still need to do shift by immediate, which was not foldable prior to AVX-512, and all the masked forms.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294285 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-07 07:30:54 +00:00
Craig Topper
e7499853bf [AVX-512] Add VPSLLDQ/VPSRLDQ to load folding tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294170 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-06 05:12:14 +00:00
Craig Topper
a455c94818 [AVX-512] Add VPABSB/D/Q/W to load folding tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294169 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-06 03:18:01 +00:00
Craig Topper
1464e37f6d [AVX-512] Add VSHUFPS/PD to load folding tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294168 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-06 03:17:58 +00:00
Craig Topper
6ab67411fe [AVX-512] Add VPMULLD/Q/W instructions to load folding tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294164 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-06 01:19:26 +00:00
Craig Topper
cb98b65ac4 [AVX-512] Add all masked and unmasked versions of VPMULDQ and VPMULUDQ to load folding tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294163 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-05 23:31:48 +00:00
Craig Topper
053d7dd312 [AVX-512] Add scalar masked max/min intrinsic instructions to the load folding tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294153 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-05 22:25:46 +00:00
Craig Topper
371f918f70 [AVX-512] Add scalar masked add/sub/mul/div intrinsic instructions to the load folding tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294152 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-05 22:25:42 +00:00
Craig Topper
e91c6128e6 [AVX-512] Add masked scalar FMA intrinsics to isNonFoldablePartialRegisterLoad to improve load folding of scalar loads.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@294151 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-05 22:25:40 +00:00
Evandro Menezes
20de1ea345 [CodeGen] Move MacroFusion to the target
This patch moves the class for scheduling adjacent instructions,
MacroFusion, to the target.

In AArch64, it also expands the fusion to all instruction pairs in a
scheduling block, beyond just among the predecessors of the branch at the
end.

Differential revision: https://reviews.llvm.org/D28489

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293737 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-01 02:54:34 +00:00
Craig Topper
5f528dcc0a [AVX-512] Don't bother looking into the AVX512DQ execution domain fixing tables if AVX512DQ isn't supported since we can't do any conversion anyway.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293608 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 06:49:55 +00:00
Craig Topper
25b197c99d [X86] Add AVX and SSE2 version of MOVSDmr to execution domain fixing table. AVX-512 already did this for the EVEX version.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293607 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 06:49:53 +00:00
Craig Topper
bde6e7c9de [AVX-512] Fix copy and paste bug in execution domain fixing tables so that we can convert 256-bit movnt instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293606 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-31 06:49:50 +00:00
Craig Topper
dd23d7ede6 [AVX-512] Remove duplicate CodeGenOnly patterns for scalar register broadcast. We can use COPY_TO_REGCLASS like AVX does.
This causes stack spill slots to be oversized sometimes, but the same should already be happening with AVX.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293464 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-30 06:59:06 +00:00
Craig Topper
8b0f69514c [AVX-512] Remove KSET0B/KSET1B in favor of the patterns that select KSET0W/KSET1W for v8i1.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293458 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-30 05:37:47 +00:00
Craig Topper
7b47370e8e [AVX-512] Teach two address instruction pass to replace masked move instructions with blendm instructions when it's beneficial.
Isel now selects masked move instructions for vselect instead of blendm. But sometimes it is beneficial to register allocation to remove the tied register constraint by using blendm instructions.

This also picks up cases where the masked move was created due to a masked load intrinsic.

Differential Revision: https://reviews.llvm.org/D28454

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292005 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-14 07:50:52 +00:00
Craig Topper
49a15c1e8e [AVX-512] Replace V_SET0 in AVX-512 patterns with AVX512_128_SET0. Enhance AVX512_128_SET0 expansion to make this possible.
We'll now expand AVX512_128_SET0 to an EVEX VXORD if VLX is available. Or if it's not, but register allocation has selected a non-extended register, we will use VEX VXORPS. And if it's an extended register without VLX we'll use a 512-bit XOR. Do the same for AVX512_FsFLD0SS/SD.
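
The three-way choice described above can be summarized as a small decision function. This is a sketch of the rule as stated in the message, not the actual expansion code; ZeroIdiom and chooseZeroIdiom are hypothetical names, and "extended register" is assumed to mean one of the upper 16 vector registers that only EVEX encodings can address.

```cpp
#include <cassert>

// Hypothetical labels for the three expansions named in the message.
enum class ZeroIdiom { EvexXor128, VexXorps128, EvexXor512 };

// Pick an expansion for a 128-bit zeroing pseudo after register allocation.
constexpr ZeroIdiom chooseZeroIdiom(bool HasVLX, bool IsExtendedReg) {
  return HasVLX           ? ZeroIdiom::EvexXor128   // EVEX 128-bit XOR
         : !IsExtendedReg ? ZeroIdiom::VexXorps128  // VEX VXORPS suffices
                          : ZeroIdiom::EvexXor512;  // widen to 512 bits
}

static_assert(chooseZeroIdiom(false, true) == ZeroIdiom::EvexXor512,
              "extended register without VLX needs the 512-bit form");
```

Deferring the choice until the register is known is what lets the pseudo be allocated to any of the 32 registers.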

This makes it possible for the register allocator to have all 32 registers available to work with.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292004 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-14 07:29:24 +00:00
Diana Picus
8a47810cd6 [CodeGen] Rename MachineInstrBuilder::addOperand. NFC
Rename from addOperand to just add, to match the other method that has been
added to MachineInstrBuilder for adding more than just 1 operand.

See https://reviews.llvm.org/D28057 for the whole discussion.

Differential Revision: https://reviews.llvm.org/D28556

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291891 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-13 09:58:52 +00:00