Lower the target independent signed saturating intrinsics to qadd8 and qadd16.
This custom lowers them from a sadd_sat, catching the node early before it is
promoted. It also adds a QADD8b and QADD16b node to mean the bottom "lane" of a
qadd8/qadd16, so that we can call demand bits on it to show that it does not
use the upper bits.
Also handles QSUB8 and QSUB16.
Differential Revision: https://reviews.llvm.org/D68974
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375402 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Allow for ignoring the check for a single use in SimplifyDemandedVectorElts
to be able to simplify operands if DemandedElts is known to contain
the union of elements used by all users.
It is a responsibility of a caller of SimplifyDemandedVectorElts to
supply correct DemandedElts.
Simplify a series of extractelement instructions if only a subset of
elements is used.
Reviewers: reames, arsenm, majnemer, nhaehnle
Reviewed By: nhaehnle
Subscribers: wdng, jvesely, nhaehnle, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67345
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375395 91177308-0d34-0410-b5e6-96231b3b80d8
Current implementation of Instruction::mayReadFromMemory()
returns !doesNotAccessMemory() which is !ReadNone. This
does not take into account that the writeonly attribute
also indicates that the call does not read from memory.
The patch changes the predicate to !doesNotReadMemory()
that reflects the intended behavior.
Differential Revision: https://reviews.llvm.org/D69086
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375389 91177308-0d34-0410-b5e6-96231b3b80d8
AAReturnedValues, AAMemoryBehavior, and AANoUnwind, can provide
information that helps during the tracking or even justifies no-capture.
We now use this information and enable no-capture in some test cases
designed a long while a ago for these cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375382 91177308-0d34-0410-b5e6-96231b3b80d8
We can end up with two loop exits whose exit counts are equivalent, but whose textual representation is different and non-obvious. For the sub-case where we have a series of exits which dominate one another (common), eliminate any exits which would iterate *after* a previous exit on the exiting iteration.
As noted in the TODO being removed, I'd always thought this was a good idea, but I've now seen this in a real workload as well.
Interestingly, in review, Nikita pointed out there's let another oppurtunity to leverage SCEV's reasoning. If we kept track of the min of dominanting exits so far, we could discharge exits with EC >= MDE. This is less powerful than the existing transform (since later exits aren't considered), but potentially more powerful for any case where SCEV can prove a >= b, but neither a == b or a > b. I don't have an example to illustrate that oppurtunity, but won't be suprised if we find one and return to handle that case as well.
Differential Revision: https://reviews.llvm.org/D69009
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375379 91177308-0d34-0410-b5e6-96231b3b80d8
In this pattern, all the "magic" bits that we'd `add` are all
high sign bits, and in the value we'd be adding to they are all unset,
not unexpectedly, so we can have an `or` there:
https://rise4fun.com/Alive/ups
It is possible that `haveNoCommonBitsSet()` should be taught about this
pattern so that we never have an `add` variant, but the reasoning would
need to be recursive (because of that `select`), so i'm not really sure
that would be worth it just yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375378 91177308-0d34-0410-b5e6-96231b3b80d8
This adds folds for comparing uadd.sat/usub.sat with zero:
* uadd.sat(a, b) == 0 => a == 0 && b == 0 => (a | b) == 0
* usub.sat(a, b) == 0 => a <= b
And inverted forms for !=.
Differential Revision: https://reviews.llvm.org/D69224
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375374 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This problem consists of several parts:
* Basic sign bit extraction - `trunc? (?shr %x, (bitwidth(x)-1))`.
This is trivial, and easy to do, we have a fold for it.
* Shift amount reassociation - if we have two identical shifts,
and we can simplify-add their shift amounts together,
then we likely can just perform them as a single shift.
But this is finicky, has one-use restrictions,
and shift opcodes must be identical.
But there is a super-pattern where both of these work together.
to produce sign bit test from two shifts + comparison.
We do indeed already handle this in most cases.
But since we get that fold transitively, it has one-use restrictions.
And what's worse, in this case the right-shifts aren't required to be
identical, and we can't handle that transitively:
If the total shift amount is bitwidth-1, only a sign bit will remain
in the output value. But if we look at this from the perspective of
two shifts, we can't fold - we can't possibly know what bit pattern
we'd produce via two shifts, it will be *some* kind of a mask
produced from original sign bit, but we just can't tell it's shape:
https://rise4fun.com/Alive/cM0https://rise4fun.com/Alive/9IN
But it will *only* contain sign bit and zeros.
So from the perspective of sign bit test, we're good:
https://rise4fun.com/Alive/FRzhttps://rise4fun.com/Alive/qBU
Superb!
So the simplest solution is to extend `reassociateShiftAmtsOfTwoSameDirectionShifts()` to also have a
sudo-analysis mode that will ignore extra-uses, and will only check
whether a) those are two right shifts and b) they end up with bitwidth(x)-1
shift amount and return either the original value that we sign-checking,
or null.
This does not have any functionality change for
the existing `reassociateShiftAmtsOfTwoSameDirectionShifts()`.
All that being said, as disscussed in the review, this yet again
increases usage of instsimplify in instcombine as utility.
Some day that may need to be reevaluated.
https://bugs.llvm.org/show_bug.cgi?id=43595
Reviewers: spatel, efriedma, vsk
Reviewed By: spatel
Subscribers: xbolva00, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68930
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375371 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
If all the shifts amount are already poison-producing,
then we can add more poison-producing flags ontop:
https://rise4fun.com/Alive/Ocwi
Otherwise, we should only consider the possible range of shift amts that don't result in poison.
For unsigned range not not overflow, we must not shift out any set bits,
and the actual limit for `x` can be computed by backtransforming
the maximal value we could ever get out of the `shl` - `-1` through
`lshr`. If the `x` is any larger than that then it will overflow.
Likewise for signed range, but just in signed domain..
This is based on the general idea outlined by @nikic in https://reviews.llvm.org/D68672#1714990
Reviewers: nikic, sanjoy
Reviewed By: nikic
Subscribers: hiraditya, llvm-commits, nikic
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69217
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375370 91177308-0d34-0410-b5e6-96231b3b80d8
Enumerate one less constant range in TestNoWrapRegionExhaustive,
which was unnecessary. This allows us to bump the bit count from
3 to 5 while keeping reasonable timing.
Drop four tests for multiply nowrap regions, as these cover subsets
of the exhaustive test. They do use a wider bitwidth, but I don't
think it's worthwhile to have them additionally now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375369 91177308-0d34-0410-b5e6-96231b3b80d8
Avoids a test regression in a future patch. Also add debug printing on
this case, so I waste less time debugging folds in the future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375367 91177308-0d34-0410-b5e6-96231b3b80d8
We handle it this way for some other address spaces.
Since r349196, SILoadStoreOptimizer has been trying to do this. This
is after SIFoldOperands runs, which can change the addressing
patterns. It's simpler to just split this earlier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375366 91177308-0d34-0410-b5e6-96231b3b80d8
This patch tries to resolve problems faced in D68943
and uses some of the code written by Konrad Wilhelm Kleine
in that patch.
Previously, yaml2obj tool always created a .symtab section.
This patch changes that. With it we only create it when
have a "Symbols:" tag in the YAML document or when
we need to create it because it is used by another section(s).
obj2yaml follows the new behavior and does not print "Symbols:"
anymore when there is no symbol table.
Differential revision: https://reviews.llvm.org/D69041
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375361 91177308-0d34-0410-b5e6-96231b3b80d8
This is a common idiom which arises after induction variables are widened, and we have two or more exit conditions. Interestingly, we don't have instcombine or instsimplify support for this either.
Differential Revision: https://reviews.llvm.org/D69006
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375349 91177308-0d34-0410-b5e6-96231b3b80d8
The unit test uses GlobalISel but the dependency is not listed in the
CMakeLists.txt file which causes failures in shared libs build with GCC.
This just adds the dependency.
Differential revision: https://reviews.llvm.org/D69064
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375346 91177308-0d34-0410-b5e6-96231b3b80d8
tryToWidenViaDuplication lowers using the shuffle_v8i16(unpack_v16i8(shuffle_v8i16(x),shuffle_v8i16(x))) pattern, but the unpack only needs the even/odd 16i8 args if the original v16i8 shuffle mask references the even/odd elements - which isn't true for many extension style shuffles.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375342 91177308-0d34-0410-b5e6-96231b3b80d8
It's not clear why the test had this. I'm unable to break the original
case with the original patch reverted with or without optnone.
This avoids a failure in a future commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375321 91177308-0d34-0410-b5e6-96231b3b80d8
Works on this dependency chain:
ArrayRef.h ->
Hashing.h -> --CUT--
Host.h ->
StringMap.h / StringRef.h
ArrayRef is very popular, but Host.h is rarely needed. Move the
IsBigEndianHost constant to SwapByteOrder.h. Clients of that header are
more likely to need it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375316 91177308-0d34-0410-b5e6-96231b3b80d8
MachineInstr.h included AliasAnalysis.h, which includes a world of IR
constructs mostly unneeded in CodeGen. Prune it. Same for
DebugInfoMetadata.h.
Noticed with -ftime-trace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375311 91177308-0d34-0410-b5e6-96231b3b80d8
If a subregister def was moved across another subregister def and
another use, the main range was not correctly updated. The end point
of the moved interval ended too early and missed the use from theh
other lanes in the subreg def.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375300 91177308-0d34-0410-b5e6-96231b3b80d8