Summary:
Add a hook for simplification of shufflevector's with the following rules:
- Constant folding - NFC, as it was already being done by the default handler.
- If only one of the operands is constant, constant fold the shuffle if the
mask does not select elements from the variable operand - to show the hook is firing and affecting the test-cases.
Reviewers: RKSimon, craig.topper, spatel, sanjoy, nlopes, majnemer
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31525
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299393 91177308-0d34-0410-b5e6-96231b3b80d8
Dont emit Mapping symbols for sections that contain only data.
Summary:
Dont emit mapping symbols for sections that contain only data.
Reviewers: rengolin, weimingz, kparzysz, t.p.northover, peter.smith
Reviewed By: t.p.northover
Patched by Shankar Easwaran <shankare@codeaurora.org>
Subscribers: alekseyshl, t.p.northover, llvm-commits
Differential Revision: https://reviews.llvm.org/D30724
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299392 91177308-0d34-0410-b5e6-96231b3b80d8
It can be costly to transfer from the gprs to the xmm registers and can prevent loads merging.
This patch splits vXi16/vXi32/vXi64 BUILD_VECTORS that use the same operand in multiple elements into a BUILD_VECTOR with only a single insertion of each of those elements and then performs an unary shuffle to duplicate the values.
There are a couple of minor regressions this patch unearths due to some missing MOVDDUP/BROADCAST folds that I will address in a future patch.
Note: Now that vector shuffle lowering and combining is pretty good we should be reusing that instead of duplicating so much in LowerBUILD_VECTOR - this is the first of several patches to address this.
Differential Revision: https://reviews.llvm.org/D31373
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299387 91177308-0d34-0410-b5e6-96231b3b80d8
The x86_64 ABI requires that the stack is 16 byte aligned on function calls. Thus, the 8-byte error code, which is pushed by the CPU for certain exceptions, leads to a misaligned stack. This results in bugs such as Bug 26413, where misaligned movaps instructions are generated.
This commit fixes the misalignment by adjusting the stack pointer in these cases. The adjustment is done at the beginning of the prologue generation by subtracting another 8 bytes from the stack pointer. These additional bytes are popped again in the function epilogue.
Fixes Bug 26413
Patch by Philipp Oppermann.
Differential Revision: https://reviews.llvm.org/D30049
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299383 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Move the aarch64-type-promotion pass within the existing type promotion framework in CGP.
This change also support forking sexts when a new sext is required for promotion.
Note that change is based on D27853 and I am submitting this out early to provide a better idea on D27853.
Reviewers: jmolloy, mcrosier, javed.absar, qcolombet
Reviewed By: qcolombet
Subscribers: llvm-commits, aemerson, rengolin, mcrosier
Differential Revision: https://reviews.llvm.org/D28680
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299379 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r299047 which is incorrect because the
simplification may result in incorrect propogation of undefs to users of
the folded shuffle.
Thanks to Andrea Di Biagio for pointing this out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299368 91177308-0d34-0410-b5e6-96231b3b80d8
- we are now using immediate AsmOperands so that the range check functions are
tablegen'ed.
- Big bonus is that error messages become much more accurate, i.e. instead of a
useless "invalid operand" error message it will not say that the immediate
operand must in range [x,y], which is why regression tests needed updating.
More tablegen operand descriptions could probably benefit from using
immediateAsmOperand, but this is a first good step to get rid of most of the
nearly identical range check functions. I will address the remaining immediate
operands in next clean ups.
Differential Revision: https://reviews.llvm.org/D31333
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299358 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Depends on D30928.
This adds support for coercion of stores and memory instructions that do not require insertion to process.
Another few tests down.
I added the relevant tests from rle.ll
Reviewers: davide
Subscribers: llvm-commits, Prazek
Differential Revision: https://reviews.llvm.org/D30929
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299330 91177308-0d34-0410-b5e6-96231b3b80d8
Disable bypassing if one of the operands looks like a hash value. Slow
division often occurs in hashtable implementations and fast division is
never taken there because a hash value is extremely unlikely to have
enough upper bits set to zero.
A value is considered to be hash-like if it is produced by
1) XOR operation
2) Multiplication by a constant wider than the shorter type
3) PHI node with all incoming values being hash-like
Differential Revision: https://reviews.llvm.org/D28200
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299329 91177308-0d34-0410-b5e6-96231b3b80d8
The code already allowed vector types in via "isInteger" (which might want
a more specific name), so use splat-friendly constant predicates to match
those types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299304 91177308-0d34-0410-b5e6-96231b3b80d8
This can only happen when we have a mix of zero and undef elements and the two vectors have a different arrangement of zeros/undefs. The shuffle should eventually be constant folded to all zeros.
Fixes PR32484.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299291 91177308-0d34-0410-b5e6-96231b3b80d8
REG_SEQUENCE falls into the same category as COPY for operands mapping:
- They don't have MCInstrDesc with register constraints
- The input variable could use whatever register classes
- It is possible to have register class already assigned to the operands
In particular, given REG_SEQUENCE are always target specific because of
the subreg indices. Those indices must apply to the register class of
the definition of the REG_SEQUENCE and therefore, the target must set a
register class to that definition. As a result, the generic code can
always use that register class to derive a valid mapping for a
REG_SEQUENCE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299285 91177308-0d34-0410-b5e6-96231b3b80d8
This patch refactors the code used in llc such that all the users of the
addPassesToEmitFile API have access to a homogeneous way of handling
start/stop-after/before options right out of the box.
Previously each user would have needed to duplicate this logic and set
up its own options.
NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299282 91177308-0d34-0410-b5e6-96231b3b80d8
This is helpful when extracting objects from archives produced by MSVC's
lib.exe, which users absolute paths to describe the archive members.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299264 91177308-0d34-0410-b5e6-96231b3b80d8
(and (setlt X, 0), (setlt Y, 0)) --> (setlt (and X, Y), 0)
We have 7 similar folds, but this one got away. The fact that the
x86 test with a branch didn't change is probably a separate bug. We
may also be missing this and the related folds in instcombine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299252 91177308-0d34-0410-b5e6-96231b3b80d8
A common way to implement nearbyint is by fiddling with the floating
point environment and calling rint. This is used at least by the BSD
libm and musl. As such, canonicalizing the latter to the former will
create infinite loops for libm and generally pessimize performance, at
least when the generic C versions are used.
This change preserves the rint in the libcall translation and also
handles the domain truncation logic, so that rint with float argument
will be reduced to rintf etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299247 91177308-0d34-0410-b5e6-96231b3b80d8
Add a new node to act as a fancy bitcast from f16 operations to
i32 that implicitly zero the high 16-bits of the result.
Alternatively could try making v2f16 legal and canonicalizing
on build_vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299246 91177308-0d34-0410-b5e6-96231b3b80d8
These are the same tests added for x86 with r299238,
but PPC doesn't specify all branches as cheap, so we
see different patterns in tests with branches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299244 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This feature enables folding of logical shift operations of up to 3 places into addressing mode on Kryo and Falkor that have a fastpath LSL.
Reviewers: mcrosier, rengolin, t.p.northover
Subscribers: junbuml, gberry, llvm-commits, aemerson
Differential Revision: https://reviews.llvm.org/D31113
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299240 91177308-0d34-0410-b5e6-96231b3b80d8
This wasn't covering for the case where you have multiple latches and hence
the use of the same loop-id which needs to be mapped to the same loop-id.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299237 91177308-0d34-0410-b5e6-96231b3b80d8
Our _MM_HINT_T0/T1 constant values are 3/2 which matches gcc, but not icc or Intel documentation. Interestingly gcc had this same bug on their implementation of the gather/scatter builtins at one point too.
Fixes PR32411.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299234 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: Currently the VP metadata was dropped when InstCombine converts a call to direct call. This patch converts the VP metadata to branch_weights so that its hotness is recorded.
Reviewers: eraman, davidxl
Reviewed By: davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31344
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299228 91177308-0d34-0410-b5e6-96231b3b80d8