Summary:
The 32-bit instructions don't zero the high 16-bits like the 16-bit
instructions do.
Reviewers: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye
Differential Revision: https://reviews.llvm.org/D26828
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287342 91177308-0d34-0410-b5e6-96231b3b80d8
insertUniqueBackedgeBlock in lib/Transforms/Utils/LoopSimplify.cpp now
propagates existing llvm.loop metadata to newly the added backedge.
llvm::TryToSimplifyUncondBranchFromEmptyBlock in lib/Transforms/Utils/Local.cpp
now propagates existing llvm.loop metadata to the branch instructions in the
predecessor blocks of the empty block that is removed.
Differential Revision: https://reviews.llvm.org/D26495
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287341 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The addr64-based legalization is incorrect for MUBUF instructions with idxen
set as well as for BUFFER_LOAD/STORE_FORMAT_* instructions. This affects
e.g. shaders that access buffer textures.
Since we never actually need the addr64-legalization in shaders, this patch
takes the easy route and keys off the calling convention. If this ever
affects (non-OpenGL) compute, the type of legalization needs to be chosen
based on some TSFlag.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98664
Reviewers: arsenm, tstellarAMD
Subscribers: kzhuravl, wdng, yaxunl, tony-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D26747
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287339 91177308-0d34-0410-b5e6-96231b3b80d8
When we see a SETCC whose only users are zero extend operations, we can replace
it with a subtraction. This results in doing all calculations in GPRs and
avoids CR use.
Currently we do this only for ULT, ULE, UGT and UGE condition codes. There are
ways that this can be extended. For example for signed condition codes. In that
case we will be introducing additional sign extend instructions, so more careful
profitability analysis may be required.
Another direction to extend this is for equal, not equal conditions. Also when
users of SETCC are any_ext or sign_ext, we might be able to do something
similar.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287329 91177308-0d34-0410-b5e6-96231b3b80d8
This is a straightforward extension of the existing support for 32/64-bit element types. Just needed to add the additional instrinsics to the switches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287316 91177308-0d34-0410-b5e6-96231b3b80d8
The same thing was done to 32-bit and 64-bit element sizes previously.
This will allow us to support these shuffls in InstCombineCalls along with the other variable shift intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287312 91177308-0d34-0410-b5e6-96231b3b80d8
since bpf instruction set was introduced people learned to
read and understand kernel verifier output whereas llvm asm
output stayed obscure and unknown. Convert llvm to emit
assembler text similar to kernel to avoid this discrepancy
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287300 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This extends FCOPYSIGN support to 512-bit vectors.
I've also added tests to show what the 128-bit and 256-bit cases look like with broadcast loads.
Reviewers: delena, zvi, RKSimon, spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26791
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287298 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: This should provide the function similar to `--disable-libedit` with the autotools build system, which seems to be missing from the commit (r200595) that adds this.
Reviewers: pcc, beanz
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D26550
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287293 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
CompareSCEVComplexity goes too deep (50+ on a quite a big unrolled loop) and runs almost infinite time.
Added cache of "equal" SCEV pairs to earlier cutoff of further estimation. Recursion depth limit was also introduced as a parameter.
Reviewers: sanjoy
Subscribers: mzolotukhin, tstellarAMD, llvm-commits
Differential Revision: https://reviews.llvm.org/D26389
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287232 91177308-0d34-0410-b5e6-96231b3b80d8
vXi64 multiplication is lowered into 3 calls of vpmuludq with the upper/lower 32-bit halves.
If any of these halves are zero then we can remove individual calls. Although there was isBuildVectorAllZeros code to do this I don't think it ever worked (maybe just for constant folded cases that don't seem to be tested for any longer).
This requires additional X86ISD support for computeKnownBitsForTargetNode, so far I've just added support for X86ISD::VZEXT (VPMOVZX* - helping the AVX2+ cases).
Partial fix for PR30845
Differential Revision: https://reviews.llvm.org/D26590
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287223 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The motivation for this is to enable correct detection of dlopen() on Android.
Android does not provide a static version of libdl, so if we add the -static flag
after performing the check, it will succeed even though subsequent link steps
will fail. With this change we correctly detect the absence of libdl in a
LLVM_BUILD_STATIC build on Android.
The link itself still does not succeed because the code does not check the result
of this check properly, but I plan to fix that in a separate change.
Reviewers: beanz
Subscribers: danalbert, mgorny, srhines, tberghammer, llvm-commits
Differential Revision: https://reviews.llvm.org/D26463
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287220 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Variadic functions can be treated in the same way as normal functions
with respect to the number and types of parameters.
Reviewers: grosbach, olista01, t.p.northover, rengolin
Subscribers: javed.absar, aemerson, llvm-commits
Differential Revision: https://reviews.llvm.org/D26748
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287219 91177308-0d34-0410-b5e6-96231b3b80d8
Register Calling Convention defines a new behavior for v64i1 types.
This type should be saved in GPR.
However for 32 bit machine we need to split the value into 2 GPRs (because each is 32 bit).
Differential Revision: https://reviews.llvm.org/D26181
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287217 91177308-0d34-0410-b5e6-96231b3b80d8
ImplicitNullCheck keeps track of one instruction that the memory
operation depends on that it also hoists with the memory operation.
When hoisting this dependency, it would sometimes clobber a live-in
value to the basic block we were hoisting the two things out of. Fix
this by explicitly looking for such dependencies.
I also noticed two redundant checks on `MO.isDef()` in IsMIOperandSafe.
They're redundant since register MachineOperands are either Defs or Uses
-- there is no third kind. I'll change the checks to asserts in a later
commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287213 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds an option to the build system LLVM_DEPENDENCY_DEBUGGING. Over time I plan to extend this to do more complex verifications, but the initial patch causes compile errors wherever there is missing a dependency on intrinsics_gen.
Because intrinsics_gen is a compile-time dependency not a link-time dependency, everything that relies on the headers generated in intrinsics_gen needs an explicit dependency.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287207 91177308-0d34-0410-b5e6-96231b3b80d8
This patch updates a bunch of places where add_dependencies was being explicitly called to add dependencies on intrinsics_gen to instead use the DEPENDS named parameter. This cleanup is needed for a patch I'm working on to add a dependency debugging mode to the build system.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287206 91177308-0d34-0410-b5e6-96231b3b80d8