Commit Graph

109149 Commits

Author SHA1 Message Date
Hiroshi Inoue
f8bf5c2a17 [SROA] Disable non-whole-alloca splits by default
This patch introduces a switch to control splitting of non-whole-alloca slices, off by default.
The switch will be turned on by default again once the issue reported in PR35657 is fixed.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320958 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-18 06:47:37 +00:00
Craig Topper
a9e5853a21 [X86] Fix mistake that I made when splitting up the setOperationAction calls recently.
The block into which I moved the things that need BWI and either 512-bit or VLX was incorrectly qualified with just hasBWI || hasVLX. Here I've qualified it with hasBWI && (hasAVX512 || hasVLX), where the hasAVX512 check will be replaced with a check for allowing 512-bit vectors in an upcoming patch.
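
The difference between the two qualifiers can be sketched as plain boolean predicates. This is an illustration only: the `Features` struct and the function names are hypothetical stand-ins for the subtarget queries named above, not the code in the X86 backend.

```cpp
// Hypothetical stand-in for the subtarget feature queries named in the message.
struct Features {
  bool BWI;
  bool AVX512;
  bool VLX;
};

// Qualifier as it stood before the fix: true whenever either feature is set.
constexpr bool oldQualifier(Features F) { return F.BWI || F.VLX; }

// Qualifier after the fix: BWI is mandatory, plus one of AVX512 or VLX.
constexpr bool newQualifier(Features F) {
  return F.BWI && (F.AVX512 || F.VLX);
}
```

With `VLX` set and `BWI` clear, `oldQualifier` accepts the combination while `newQualifier` rejects it, which is the case the original condition let through.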

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320957 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-18 04:50:05 +00:00
Serguei Katkov
5f348c6a0c [CGP] Fix the handling of select inst in complex addressing mode
When we put the value in the select placeholder, we must pass
the value through the simplification tracker because the value might
already be simplified and erased.

This is a fix for PR35658.

Reviewers: john.brawn, uabelho
Reviewed By: john.brawn
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D41251


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320956 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-18 04:25:07 +00:00
Bjorn Steinbrink
53f8289df4 Re-commit "Properly handle multi-element and dynamically sized allocas in getPointerDereferenceableBytes()"
llvm-clang-x86_64-expensive-checks-win is still broken, so the failure
seems unrelated.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320953 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-17 21:20:16 +00:00
Craig Topper
aa22588b15 [X86] Make the code that creates fmaddsub from build_vector of extracts and inserts functional and add tests.
Summary:
We had no tests for this and we couldn't do the optimization because of a bad use count check. We need to know how many non-undef pieces of the build vector were filled in and ensure our use count is equal to that. But on the shuffle combine version we need the use count to be 2.

The missing coverage was noticed during the review of D40335.

Reviewers: RKSimon, zvi, spatel

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41133

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320950 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-17 18:23:45 +00:00
Sam Clegg
6e6de69a0f [WebAssembly] Export some more info on wasm functions
Summary:
These fields are useful for lld's gc-sections support

Also remove an unused field.

Subscribers: jfb, dschuff, jgravelle-google, aheejin, sunfish

Differential Revision: https://reviews.llvm.org/D41320

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320946 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-17 17:50:07 +00:00
Bjorn Steinbrink
ce542fd0ce Revert "Properly handle multi-element and dynamically sized allocas in getPointerDereferenceableBytes()"
This reverts commit 217067d517.

Fails on llvm-clang-x86_64-expensive-checks-win

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320945 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-17 15:16:58 +00:00
Bjorn Steinbrink
7bf95b403d Revert "Treat sret arguments as being dereferenceable in getPointerDereferenceableBytes()"
This reverts commit 8b7a7660a3.

I didn't mean to commit this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320944 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-17 15:16:51 +00:00
Bjorn Steinbrink
8b7a7660a3 Treat sret arguments as being dereferenceable in getPointerDereferenceableBytes()
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320943 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-17 15:11:52 +00:00
Simon Pilgrim
4521272d73 Remove superfluous break after a return. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320941 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-17 11:01:33 +00:00
Craig Topper
eb51e213d1 [X86DomainReassignment] Store legal domains in a std::bitset instead of using a SmallVector that really only ever has one element as a set.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320940 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-17 03:16:23 +00:00
Bjorn Steinbrink
b2ce483243 Properly handle byval arguments in getPointerDereferenceableBytes()
Summary:
For byval arguments, the number of dereferenceable bytes is equal to
the size of the pointee, not the pointer.
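
The pointee-versus-pointer distinction can be illustrated with `sizeof`. This is only a sketch of the idea, assuming a hypothetical 64-byte aggregate named `Pointee`; it does not reproduce getPointerDereferenceableBytes().

```cpp
#include <cstddef>

// Hypothetical aggregate that a caller might pass byval.
struct Pointee {
  char Bytes[64];
};

// Bytes that may be read through a byval pointer: the size of the object.
constexpr std::size_t PointeeSize = sizeof(Pointee);

// Size of the pointer value itself, which is the wrong quantity to report.
constexpr std::size_t PointerSize = sizeof(Pointee *);
```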

Reviewers: hfinkel, rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41305

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320939 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-17 02:37:42 +00:00
Bjorn Steinbrink
217067d517 Properly handle multi-element and dynamically sized allocas in getPointerDereferenceableBytes()
Reviewers: hfinkel, rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D41288

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320938 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-17 01:54:25 +00:00
Craig Topper
43beef6ca6 [X86] Use extract_vector_elt instead of X86ISD::VEXTRACT for isel of vXi1 extractions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320937 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-17 01:35:48 +00:00
Craig Topper
036a90a54c [X86] Canonicalize extract_vector_elt from vXi1 to always return MVT::i32.
This allows us to remove some isel patterns that allowed MVT::i8 result type.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320936 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-17 01:35:47 +00:00
Craig Topper
e03e617120 [X86] Don't create X86ISD::VEXTRACT nodes directly. Use EXTRACT_VECTOR_ELT and allow that to be legalized to VEXTRACT.
I think we can remove the VEXTRACT node completely and use a canonicalized EXTRACT_VECTOR_ELT instead. This is a first step.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320935 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-17 01:35:44 +00:00
Simon Pilgrim
c61df73fcb Fix unused variable warning.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320934 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16 23:37:51 +00:00
Simon Pilgrim
9974ce94da [X86][AVX] lowerVectorShuffleAsBroadcast - aggressively peek through BITCASTs
Assuming we can safely adjust the broadcast index for the new type to keep it suitably aligned, then peek through BITCASTs when looking for the broadcast source.

Fixes PR32007

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320933 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16 23:32:18 +00:00
Simon Pilgrim
ebebd80fc4 [X86][AVX] Use extract128BitVector helper. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320932 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16 23:09:57 +00:00
Simon Pilgrim
e15bb62995 [X86][AVX] Fix failed broadcast fold
Strip excess BITCASTs from EXTRACT_SUBVECTOR input

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320930 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16 22:57:17 +00:00
Sean Fertile
a9c4e7f664 [Memcpy Loop Lowering] Only calculate residual size/bytes copied when needed.
If the loop operand type is int8 then there will be no residual loop for the
unknown size expansion. Don't create the residual-size and bytes-copied values
when they are not needed.
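
A minimal sketch of the arithmetic behind this, assuming the unknown-size expansion copies `CopyLen / LoopOpSize` whole operands in the main loop and leaves `CopyLen % LoopOpSize` bytes for a residual loop. The `CopyPlan` struct and `planCopy` function are hypothetical names used for illustration.

```cpp
#include <cstddef>

// Hypothetical summary of how a copy of CopyLen bytes is split up.
struct CopyPlan {
  std::size_t LoopIterations; // whole operands copied by the main loop
  std::size_t ResidualBytes;  // bytes left over for the residual loop
};

// Split CopyLen bytes into main-loop iterations and a residual.
constexpr CopyPlan planCopy(std::size_t CopyLen, std::size_t LoopOpSize) {
  return CopyPlan{CopyLen / LoopOpSize, CopyLen % LoopOpSize};
}
```

With a one-byte operand the remainder is zero for every length, so the residual values are never needed.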

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320929 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16 22:41:39 +00:00
Craig Topper
8923be6680 [X86] Don't pass a zero input to the passthru operand of getVectorMaskingNode/getScalarMaskingNode when it's going to emit an ISD::OR/ISD::AND. NFCI
In those cases, the pass thru operand of the methods isn't used. The calls to the scalar version were passing a MVT::i1 zero, which is an illegal type at the stage this code runs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320928 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16 21:12:24 +00:00
Craig Topper
ab97c302b2 [X86] Have getVectorMaskingNode return an ISD::AND for X86ISD::VPSHUFBITQMB instead of creating a select with one input being 0.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320927 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16 21:12:23 +00:00
Craig Topper
be2953e6e7 [X86] When using vpopcntdq for ctpop of v8i16 vectors, only promote to v8i32.
Previously we promoted to v8i64, but we don't need to go all the way to 512-bits. If we have VLX we can use the 256-bit instruction. And even if we don't have VLX we can widen v8i32 to v16i32 and drop the upper half.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320926 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16 19:31:36 +00:00
Craig Topper
2ab19173fc [X86] Combine some more scheduler model entries using regular expressions.
We had a lot of separate 32 and 64 instructions that had the same scheduling data. This merges them into the same regular expression. This is pretty consistent with a lot of other instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320924 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16 18:35:31 +00:00
Craig Topper
9d5f0f873a [X86] Use instrs instead of instregex for gather/scatter instructions in the scheduler models. Combine into single InstrRW entries.
This reduces the number of scheduler groups in subtarget info.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320923 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16 18:35:29 +00:00
Sanjay Patel
ccf3928623 [InstCombine] canonicalize shifty abs(): ashr+add+xor --> cmp+neg+sel
We want to do this for 2 reasons:
1. Value tracking does not recognize the ashr variant, so it would fail to match for cases like D39766.
2. DAGCombiner does better at producing optimal codegen when we have the cmp+sel pattern.

More detail about what happens in the backend:
1. DAGCombiner has a generic transform for all targets to convert the scalar cmp+sel variant of abs 
   into the shift variant. That is the opposite of this IR canonicalization.
2. DAGCombiner has a generic transform for all targets to convert the vector cmp+sel variant of abs 
   into either an ABS node or the shift variant. That is again the opposite of this IR canonicalization.
3. DAGCombiner has a generic transform for all targets to convert the exact shift variants produced by #1 or #2
   into an ISD::ABS node. Note: It would be an efficiency improvement if we had #1 go directly to an ABS node 
   when that's legal/custom.
4. The pattern matching above is incomplete, so it is possible to escape the intended/optimal codegen in a 
   variety of ways.
   a. For #2, the vector path is missing the case for setlt with a '1' constant.
   b. For #3, we are missing a match for commuted versions of the shift variants.
5. Therefore, this IR canonicalization can only help get us to the optimal codegen. The version of cmp+sel 
   produced by this patch will be recognized in the DAG and converted to an ABS node when possible or the 
   shift sequence when not.
6. In the following examples with this patch applied, we may get conditional moves rather than the shift 
   produced by the generic DAGCombiner transforms. The conditional move is created using a target-specific 
   decision for any given target. Whether it is optimal or not for a particular subtarget may be up for debate.

define i32 @abs_shifty(i32 %x) {
  %signbit = ashr i32 %x, 31 
  %add = add i32 %signbit, %x  
  %abs = xor i32 %signbit, %add 
  ret i32 %abs
}

define i32 @abs_cmpsubsel(i32 %x) {
  %cmp = icmp slt i32 %x, zeroinitializer
  %sub = sub i32 zeroinitializer, %x
  %abs = select i1 %cmp, i32 %sub, i32 %x
  ret i32 %abs
}

define <4 x i32> @abs_shifty_vec(<4 x i32> %x) {
  %signbit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31> 
  %add = add <4 x i32> %signbit, %x  
  %abs = xor <4 x i32> %signbit, %add 
  ret <4 x i32> %abs
}

define <4 x i32> @abs_cmpsubsel_vec(<4 x i32> %x) {
  %cmp = icmp slt <4 x i32> %x, zeroinitializer
  %sub = sub <4 x i32> zeroinitializer, %x
  %abs = select <4 x i1> %cmp, <4 x i32> %sub, <4 x i32> %x
  ret <4 x i32> %abs
}

> $ ./opt -instcombine shiftyabs.ll -S | ./llc -o - -mtriple=x86_64 -mattr=avx 
> abs_shifty:
> 	movl	%edi, %eax
> 	negl	%eax
> 	cmovll	%edi, %eax
> 	retq
> 
> abs_cmpsubsel:
> 	movl	%edi, %eax
> 	negl	%eax
> 	cmovll	%edi, %eax
> 	retq
> 
> abs_shifty_vec:
> 	vpabsd	%xmm0, %xmm0
> 	retq
> 
> abs_cmpsubsel_vec:
> 	vpabsd	%xmm0, %xmm0
> 	retq
> 
> $ ./opt -instcombine shiftyabs.ll -S | ./llc -o - -mtriple=aarch64
> abs_shifty:
> 	cmp	w0, #0                  // =0
> 	cneg	w0, w0, mi
> 	ret
> 
> abs_cmpsubsel: 
> 	cmp	w0, #0                  // =0
> 	cneg	w0, w0, mi
> 	ret
>                                        
> abs_shifty_vec: 
> 	abs	v0.4s, v0.4s
> 	ret
> 
> abs_cmpsubsel_vec: 
> 	abs	v0.4s, v0.4s
> 	ret
> 
> $ ./opt -instcombine shiftyabs.ll -S | ./llc -o - -mtriple=powerpc64le 
> abs_shifty:  
> 	srawi 4, 3, 31
> 	add 3, 3, 4
> 	xor 3, 3, 4
> 	blr
> 
> abs_cmpsubsel:
> 	srawi 4, 3, 31
> 	add 3, 3, 4
> 	xor 3, 3, 4
> 	blr
> 
> abs_shifty_vec:   
> 	vspltisw 3, -16
> 	vspltisw 4, 15
> 	vsubuwm 3, 4, 3
> 	vsraw 3, 2, 3
> 	vadduwm 2, 2, 3
> 	xxlxor 34, 34, 35
> 	blr
> 
> abs_cmpsubsel_vec: 
> 	vspltisw 3, -16
> 	vspltisw 4, 15
> 	vsubuwm 3, 4, 3
> 	vsraw 3, 2, 3
> 	vadduwm 2, 2, 3
> 	xxlxor 34, 34, 35
> 	blr
>
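
The two scalar IR forms above compute the same value, which can be checked with a small C++ sketch. The function names mirror the IR examples, but the code is an illustration written for this note. It uses unsigned arithmetic so that the INT32_MIN case wraps instead of overflowing, and it assumes that right-shifting a negative signed value is an arithmetic shift (guaranteed since C++20, and the behavior of mainstream compilers before that).

```cpp
#include <cstdint>

// Shift-based form: ashr + add + xor.
std::uint32_t abs_shifty(std::int32_t X) {
  // Assumed arithmetic shift: 0 for X >= 0, all ones for X < 0.
  std::uint32_t SignBit = static_cast<std::uint32_t>(X >> 31);
  std::uint32_t Add = SignBit + static_cast<std::uint32_t>(X);
  return SignBit ^ Add;
}

// Compare-based form: cmp + neg (sub from zero) + select.
std::uint32_t abs_cmpsubsel(std::int32_t X) {
  std::uint32_t UX = static_cast<std::uint32_t>(X);
  return X < 0 ? 0u - UX : UX;
}
```

Both functions map INT32_MIN to 0x80000000, matching the wrapping behavior of the IR `add` and `sub` without nsw flags.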

Differential Revision: https://reviews.llvm.org/D40984





git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320921 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16 16:41:17 +00:00
Craig Topper
53c6a87b9f [X86] Remove unneeded code for handling the old kunpck intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320917 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16 06:58:30 +00:00
Hal Finkel
3f92210a79 [LV] Extend InstWidening with CM_Widen_Reverse
Changes to the original scalar loop during LV code gen cause the return value
of Legal->isConsecutivePtr() to be inconsistent with the return value during
legal/cost phases (further analysis and information of the bug is in D39346).
This patch is an alternative fix to PR34965 following the CM_Widen approach
proposed by Ayal and Gil in D39346. It extends InstWidening enum with
CM_Widen_Reverse to properly record the widening decision for consecutive
reverse memory accesses and, consequently, get rid of the
Legal->isConsecutivePtr() call in LV code gen. I think this is a simpler/cleaner
solution to PR34965 than the one in D39346.

Fixes PR34965.

Patch by Diego Caballero, thanks!

Differential Revision: https://reviews.llvm.org/D40742

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320913 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16 02:55:24 +00:00
Hal Finkel
4f44d46023 [PowerPC, AsmParser] Enable the mnemonic spell corrector
r307148 added assembly mnemonic spelling correction support and enabled it
on ARM. This enables that support on PowerPC as well.

Patch by Dmitry Venikov, thanks!

Differential Revision: https://reviews.llvm.org/D40552

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320911 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16 02:42:18 +00:00
Craig Topper
196a560857 [X86] Add 128 and 256-bit VPOPCNTDQ instructions. Adjust some tablegen classes LZCNT/POPCNT.
I think when this instruction was first published it was only for a Knights CPU and thus the VLX version was missing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320910 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16 02:40:28 +00:00
Vitaly Buka
ef76fcda6e [LTO] Make processing of combined module more consistent
Summary:
1. Use stream 0 only for the combined module. Previously, if the combined module was not
processed, ThinLTO used the stream for its own output. However, small changes in the input
could trigger the combined module and shuffle the outputs, making life harder for users of llvm::LTO.

2. Always process the combined module and write its output to stream 0. Processing an empty
combined module is cheap and spares llvm::LTO users from implementing processing
that llvm::LTO already does.

Subscribers: mehdi_amini, inglorion, eraman, hiraditya

Differential Revision: https://reviews.llvm.org/D41267

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320905 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16 02:10:00 +00:00
Hal Finkel
1d4f2b0d25 [SimplifyLibCalls] Inline calls to cabs when it's safe to do so
When unsafe algebra is allowed, calls to cabs(r) can be replaced by:

  sqrt(creal(r)*creal(r) + cimag(r)*cimag(r))
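
The replacement expression can be sketched in C++ with `std::complex`. The helper name `cabs_inlined` is hypothetical. The sketch also hints at why the rewrite is only applied under unsafe algebra: the intermediate squares can overflow where a library implementation of the magnitude typically does not.

```cpp
#include <cmath>
#include <complex>

// Naive expansion of cabs(r): sqrt(re*re + im*im). Hypothetical helper name.
double cabs_inlined(std::complex<double> R) {
  return std::sqrt(R.real() * R.real() + R.imag() * R.imag());
}
```

For `(3, 4)` the result is `5`. For `(1e200, 1e200)` the squares overflow to infinity, whereas `std::abs` on the same value stays finite on typical implementations.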

Patch by Paul Walker, thanks!

Differential Revision: https://reviews.llvm.org/D40069

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320901 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16 01:26:25 +00:00
Hal Finkel
625a9ef4f3 [LV] NFC patch for moving VP*Recipe class definitions from LoopVectorize.cpp to VPlan.h
This is a small step forward to move VPlan stuff to where it should belong (i.e., VPlan.*):

  1. VP*Recipe classes in LoopVectorize.cpp are moved to VPlan.h.
  2. Many of VP*Recipe::print() and execute() definitions are still left in
     LoopVectorize.cpp since they refer to things declared in LoopVectorize.cpp. To
     be moved to VPlan.cpp at a later time.
  3. InterleaveGroup class is moved from anonymous namespace to llvm namespace.
     Referencing it in an anonymous namespace from VPlan.h resulted in a warning.

Patch by Hideki Saito, thanks!

Differential Revision: https://reviews.llvm.org/D41045

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320900 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16 01:12:50 +00:00
Craig Topper
fbba83deb2 [X86] Add back the assert from r320830 that was reverted in r320850
Hopefully r320864 has fixed the offending case that failed the assert.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320898 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16 00:33:16 +00:00
Teresa Johnson
a48c4cca96 Fix NDEBUG build problem in r320895
Fix incorrect placement of #endif causing NDEBUG build failures.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320897 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16 00:29:31 +00:00
Teresa Johnson
2140d926da [ThinLTO] Enable importing of aliases as copy of aliasee
Summary:
This implements a missing feature to allow importing of aliases, which
was previously disabled because an alias cannot be available_externally.
We instead import an alias as a copy of its aliasee.

Some additional work was required in the IndexBitcodeWriter for the
distributed build case, to ensure that the aliasee has a value id
in the distributed index file (i.e. even when it is not being
imported directly).

This is a performance win in codes that have many aliases, e.g. C++
applications that have many constructor and destructor aliases.

Reviewers: pcc

Subscribers: mehdi_amini, inglorion, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D40747

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320895 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-16 00:18:12 +00:00
David Blaikie
2d82935d1a Fix WebAssembly backend for some LLVM API changes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320893 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15 23:52:06 +00:00
Paul Robinson
87764e1e35 Revert "Recommit "[DWARFv5] Dump an MD5 checksum in the line-table header.""
This reverts commit 0afef672f63f0e4e91938656bc73424a8c058bfc.
Still failing at runtime on bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320888 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15 23:21:52 +00:00
Paul Robinson
bee91d7634 Recommit "[DWARFv5] Dump an MD5 checksum in the line-table header."
Adds missing support for DW_FORM_data16.

Update of r320852, fixing the unittest to use a hand-coded struct
instead of std::array to guarantee data layout.

Differential Revision: https://reviews.llvm.org/D41090

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320886 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15 22:57:17 +00:00
Matthias Braun
3587705651 Fix unused variable in non-assert builds
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320885 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15 22:53:33 +00:00
Matthias Braun
d318139827 MachineFunction: Return reference from getFunction(); NFC
The Function can never be nullptr so we can return a reference.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320884 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15 22:22:58 +00:00
Matthias Braun
dfcb4f5344 MachineFunction: Slight refactoring; NFC
Slight cleanup/refactor in preparation for upcoming commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320882 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15 22:22:46 +00:00
Galina Kistanova
4e7906d9cd Fixed the gcc 'enumeral and non-enumeral type in conditional expression [-Werror=extra]' warning introduced by r320750
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320868 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15 22:15:29 +00:00
Krzysztof Parzyszek
4d6de6f6af [Hexagon] Remove recursion in visitUsesOf, replace with use queue
This is primarily to reduce stack usage, but ordering the use queue
according to the position in the code (earlier instructions visited
before later ones) reduces the number of unnecessary bottoms due to
visiting instructions out of order, e.g.
  %reg1 = copy %reg0
  %reg2 = copy %reg0
  %reg3 = and %reg1, %reg2
Here, reg3 should be known to be same as reg0-2, but if reg3 is
evaluated after reg1 is updated, but before reg2 is updated, the two
inputs to the and will appear different, causing reg3 to become
bottom.
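
The ordering idea can be sketched with a min-priority queue keyed on instruction position. This is not the Hexagon code: `Inst` and `visitUsesInOrder` are hypothetical names, and an instruction is reduced to the list of positions that use its result.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical instruction: its index in the vector is its position in the
// code, and Users holds the positions of instructions that use its result.
struct Inst {
  std::vector<std::size_t> Users;
};

// Visit everything reachable through use edges from Start without recursion,
// always taking the earliest queued position first.
std::vector<std::size_t> visitUsesInOrder(const std::vector<Inst> &Code,
                                          std::size_t Start) {
  std::priority_queue<std::size_t, std::vector<std::size_t>,
                      std::greater<std::size_t>>
      Queue;
  std::vector<bool> Queued(Code.size(), false);
  std::vector<std::size_t> Order;
  Queue.push(Start);
  Queued[Start] = true;
  while (!Queue.empty()) {
    std::size_t I = Queue.top();
    Queue.pop();
    Order.push_back(I);
    for (std::size_t U : Code[I].Users) {
      if (!Queued[U]) {
        Queued[U] = true;
        Queue.push(U);
      }
    }
  }
  return Order;
}
```

For the three-instruction example above, with the definition of reg0 at position 0, the queue yields positions 0, 1, 2, 3, so both copies are visited before the `and`. A depth-first recursion would reach the `and` through reg1 before reg2 has been updated.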


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320866 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15 21:34:05 +00:00
Krzysztof Parzyszek
a211d55b2f [Hexagon] Handle concat_vectors of all allowed HVX types
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320865 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15 21:23:12 +00:00
Craig Topper
2dccaf4e16 [X86] Use AND32ri8 instead of AND64ri8 in Asan code in EmitCallAsanReport for 32-bit mode.
This seemed to work due to a quirk in the X86 MC encoder that didn't emit the REX byte that AND64ri8 implies when in 32-bit mode. This made the encoding the same as AND32ri8. I found this when I tried to add an assert to catch the dropped REX prefix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320864 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15 21:18:06 +00:00
Craig Topper
4c29a62efa [X86] In LowerVectorCTPOP use ISD::ZERO_EXTEND/ISD::TRUNCATE instead of the target specific nodes.
The target independent nodes will get legalized to the target specific nodes by their own legalization process. Someday I'd like to stop using a target specific node for zero extends and truncates of legal types, so the fewer places we reference the target specific opcode the better.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320863 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15 21:18:05 +00:00
Craig Topper
6be8b9966d [X86] Remove unnecessary TODO.
When I wrote it I thought we were missing a potential optimization for KNL. But investigating further shows that for KNL we still do the optimal thing by widening to v4f32 and then using special isel patterns to widen again to a zmm register.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320862 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15 20:57:18 +00:00
Jun Bum Lim
d154dd9bb4 Re-commit : [LICM] Allow sinking when foldable in loop
This recommits r320823, which was reverted due to the test failure in sink-foldable.ll and
an unused variable. Added "REQUIRES: aarch64-registered-target" in the test
and removed the unused variable.

Original commit message:

  Continue trying to sink an instruction if its users in the loop are foldable.
  This will allow the instruction to be folded in the loop by decoupling it from
  the user outside of the loop.

  Reviewers: hfinkel, majnemer, davidxl, efriedma, danielcdh, bmakam, mcrosier

  Reviewed By: hfinkel

  Subscribers: javed.absar, bmakam, mcrosier, llvm-commits

  Differential Revision: https://reviews.llvm.org/D37076

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320858 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15 20:33:24 +00:00