In this pattern, all the "magic" bits that we'd `add` are all
high sign bits, and in the value we'd be adding to they are all unset,
not unexpectedly, so we can have an `or` there:
https://rise4fun.com/Alive/ups
It is possible that `haveNoCommonBitsSet()` should be taught about this
pattern so that we never have an `add` variant, but the reasoning would
need to be recursive (because of that `select`), so i'm not really sure
that would be worth it just yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375378 91177308-0d34-0410-b5e6-96231b3b80d8
This adds folds for comparing uadd.sat/usub.sat with zero:
* uadd.sat(a, b) == 0 => a == 0 && b == 0 => (a | b) == 0
* usub.sat(a, b) == 0 => a <= b
And inverted forms for !=.
Differential Revision: https://reviews.llvm.org/D69224
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375374 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This problem consists of several parts:
* Basic sign bit extraction - `trunc? (?shr %x, (bitwidth(x)-1))`.
This is trivial, and easy to do, we have a fold for it.
* Shift amount reassociation - if we have two identical shifts,
and we can simplify-add their shift amounts together,
then we likely can just perform them as a single shift.
But this is finicky, has one-use restrictions,
and shift opcodes must be identical.
But there is a super-pattern where both of these work together.
to produce sign bit test from two shifts + comparison.
We do indeed already handle this in most cases.
But since we get that fold transitively, it has one-use restrictions.
And what's worse, in this case the right-shifts aren't required to be
identical, and we can't handle that transitively:
If the total shift amount is bitwidth-1, only a sign bit will remain
in the output value. But if we look at this from the perspective of
two shifts, we can't fold - we can't possibly know what bit pattern
we'd produce via two shifts, it will be *some* kind of a mask
produced from original sign bit, but we just can't tell it's shape:
https://rise4fun.com/Alive/cM0https://rise4fun.com/Alive/9IN
But it will *only* contain sign bit and zeros.
So from the perspective of sign bit test, we're good:
https://rise4fun.com/Alive/FRzhttps://rise4fun.com/Alive/qBU
Superb!
So the simplest solution is to extend `reassociateShiftAmtsOfTwoSameDirectionShifts()` to also have a
sudo-analysis mode that will ignore extra-uses, and will only check
whether a) those are two right shifts and b) they end up with bitwidth(x)-1
shift amount and return either the original value that we sign-checking,
or null.
This does not have any functionality change for
the existing `reassociateShiftAmtsOfTwoSameDirectionShifts()`.
All that being said, as disscussed in the review, this yet again
increases usage of instsimplify in instcombine as utility.
Some day that may need to be reevaluated.
https://bugs.llvm.org/show_bug.cgi?id=43595
Reviewers: spatel, efriedma, vsk
Reviewed By: spatel
Subscribers: xbolva00, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68930
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375371 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
If all the shifts amount are already poison-producing,
then we can add more poison-producing flags ontop:
https://rise4fun.com/Alive/Ocwi
Otherwise, we should only consider the possible range of shift amts that don't result in poison.
For unsigned range not not overflow, we must not shift out any set bits,
and the actual limit for `x` can be computed by backtransforming
the maximal value we could ever get out of the `shl` - `-1` through
`lshr`. If the `x` is any larger than that then it will overflow.
Likewise for signed range, but just in signed domain..
This is based on the general idea outlined by @nikic in https://reviews.llvm.org/D68672#1714990
Reviewers: nikic, sanjoy
Reviewed By: nikic
Subscribers: hiraditya, llvm-commits, nikic
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69217
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375370 91177308-0d34-0410-b5e6-96231b3b80d8
Enumerate one less constant range in TestNoWrapRegionExhaustive,
which was unnecessary. This allows us to bump the bit count from
3 to 5 while keeping reasonable timing.
Drop four tests for multiply nowrap regions, as these cover subsets
of the exhaustive test. They do use a wider bitwidth, but I don't
think it's worthwhile to have them additionally now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375369 91177308-0d34-0410-b5e6-96231b3b80d8
Avoids a test regression in a future patch. Also add debug printing on
this case, so I waste less time debugging folds in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375367 91177308-0d34-0410-b5e6-96231b3b80d8
We handle it this way for some other address spaces.
Since r349196, SILoadStoreOptimizer has been trying to do this. This
is after SIFoldOperands runs, which can change the addressing
patterns. It's simpler to just split this earlier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375366 91177308-0d34-0410-b5e6-96231b3b80d8
This patch tries to resolve problems faced in D68943
and uses some of the code written by Konrad Wilhelm Kleine
in that patch.
Previously, yaml2obj tool always created a .symtab section.
This patch changes that. With it we only create it when
have a "Symbols:" tag in the YAML document or when
we need to create it because it is used by another section(s).
obj2yaml follows the new behavior and does not print "Symbols:"
anymore when there is no symbol table.
Differential revision: https://reviews.llvm.org/D69041
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375361 91177308-0d34-0410-b5e6-96231b3b80d8
This is a common idiom which arises after induction variables are widened, and we have two or more exit conditions. Interestingly, we don't have instcombine or instsimplify support for this either.
Differential Revision: https://reviews.llvm.org/D69006
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375349 91177308-0d34-0410-b5e6-96231b3b80d8
The unit test uses GlobalISel but the dependency is not listed in the
CMakeLists.txt file which causes failures in shared libs build with GCC.
This just adds the dependency.
Differential revision: https://reviews.llvm.org/D69064
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375346 91177308-0d34-0410-b5e6-96231b3b80d8
tryToWidenViaDuplication lowers using the shuffle_v8i16(unpack_v16i8(shuffle_v8i16(x),shuffle_v8i16(x))) pattern, but the unpack only needs the even/odd 16i8 args if the original v16i8 shuffle mask references the even/odd elements - which isn't true for many extension style shuffles.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375342 91177308-0d34-0410-b5e6-96231b3b80d8
It's not clear why the test had this. I'm unable to break the original
case with the original patch reverted with or without optnone.
This avoids a failure in a future commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375321 91177308-0d34-0410-b5e6-96231b3b80d8
Works on this dependency chain:
ArrayRef.h ->
Hashing.h -> --CUT--
Host.h ->
StringMap.h / StringRef.h
ArrayRef is very popular, but Host.h is rarely needed. Move the
IsBigEndianHost constant to SwapByteOrder.h. Clients of that header are
more likely to need it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375316 91177308-0d34-0410-b5e6-96231b3b80d8
MachineInstr.h included AliasAnalysis.h, which includes a world of IR
constructs mostly unneeded in CodeGen. Prune it. Same for
DebugInfoMetadata.h.
Noticed with -ftime-trace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375311 91177308-0d34-0410-b5e6-96231b3b80d8
If a subregister def was moved across another subregister def and
another use, the main range was not correctly updated. The end point
of the moved interval ended too early and missed the use from theh
other lanes in the subreg def.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375300 91177308-0d34-0410-b5e6-96231b3b80d8
by ExtBinary format profile
Profile on-demand loading was added for ExtBinary format profile in rL374233,
but currently profile on-demand loading doesn't work well with profile
remapping. The patch adds the support.
Suppose a function in the current module has outline instance in the profile.
The function name in the module is different from the name of the outline
instance, but remapper knows the two names are equal. When loading profile
on-demand, the outline instance has to be loaded with remapper's help.
At the same time SampleProfileReaderItaniumRemapper is changed from a proxy
of SampleProfileReader to a helper member in SampleProfileReader.
Differential Revision: https://reviews.llvm.org/D68901
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375295 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: The implementation was never completed and never used except in tests.
Reviewers: arsenm, mareko
Subscribers: qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69163
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375293 91177308-0d34-0410-b5e6-96231b3b80d8
This is really embarrassing. Those are pointers, so that offsets the
pointers, not the statistics pointed-by the pointer...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375290 91177308-0d34-0410-b5e6-96231b3b80d8
Occasionally, during test teardown, LLDB writes to a closed pipe.
Sometimes the communication is inherently unreliable, so LLDB tries to
avoid being killed due to SIGPIPE (it calls `signal(SIGPIPE, SIG_IGN)`).
However, LLVM's default SIGPIPE behavior overrides LLDB's, causing it to
exit with IO_ERR.
Opt LLDB out of the default SIGPIPE behavior. I expect that this will
resolve some LLDB test suite flakiness (tests randomly failing with
IO_ERR) that we've seen since r344372.
rdar://55750240
Differential Revision: https://reviews.llvm.org/D69148
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375288 91177308-0d34-0410-b5e6-96231b3b80d8
Previously, the parser checked for a '%' prefix to indicate a register.
In Intel syntax mode, LLVM does not print a '%' prefix on registers, so
LLVM could not parse its own assembly output. Instead, require that
register numbers be integer literals, or at least start with an integer
literal, which is consistent with .cfi_* directive register parsing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375287 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Also changes the wasm YAML format to reflect the possibility of having
multiple return types and to put the returns after the params for
consistency with the binary encoding.
Reviewers: aheejin, sbc100
Subscribers: dschuff, jgravelle-google, hiraditya, sunfish, arphaman, rupprecht, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69156
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375283 91177308-0d34-0410-b5e6-96231b3b80d8
The default implementation of isIncomingArgumentHandler could lead
to generating incorrect code.
Make it a pure virtual method, so that targets know they have to
override it to produce correct code.
NFC
Differential Revision: https://reviews.llvm.org/D69187
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375277 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
CVP, unlike InstCombine, does not run till exaustion.
It only does a single pass.
When dealing with those special binops, if we prove that they can
safely be demoted into their usual binop form,
we do set the no-wrap we deduced. But when dealing with usual binops,
we try to deduce both no-wraps.
So if we convert e.g. @llvm.uadd.with.overflow() to `add nuw`,
we won't attempt to check whether it can be `add nuw nsw`.
This patch proposes to call `processBinOp()` on newly-created binop,
which is identical to what we do for div/rem already.
Reviewers: nikic, spatel, reames
Reviewed By: nikic
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69183
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375273 91177308-0d34-0410-b5e6-96231b3b80d8
Mostly use SReg_32 instead of SReg_32_XM0 for arbitrary values. This
will allow the register coalescer to do a better job eliminating
copies to m0.
For GlobalISel, as a terrible hack, use SGPR_32 for things that should
use SCC until booleans are solved.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375267 91177308-0d34-0410-b5e6-96231b3b80d8
JITLink is LLVM's newer jit-linker. It is an alternative to (and hopefully
eventually a replacement for) LLVM's older jit-linker, RuntimeDyld. Unlike
RuntimeDyld which requries JIT'd code to be complied with the large code
model, JITlink can link code compiled with the small code model, which is
the native code model for a number of targets (including all supported MachO
targets).
This example shows how to:
-- Create a JITLink InProcessMemoryManager
-- Set the code model to small
-- Use a JITLink backed ObjectLinkingLayer as the linking layer for LLJIT
(rather than the default RTDyldObjectLinkingLayer).
Note: This example will only work on platforms supported by JITLink. As of
this commit that's MachO/x86-64 and MachO/arm64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375266 91177308-0d34-0410-b5e6-96231b3b80d8