f32 vectors would use a sequence of BFI instructions instead
of unrolled cmp + select. This was better in the case of a VALU
select with SGPR inputs, but we don't have a way of dealing with that
in the DAG.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270731 91177308-0d34-0410-b5e6-96231b3b80d8
LegalizeIntegerTypes does not have a way to expand multiplications for large
integer types (i.e. larger than twice the native bit width). There's no
standard runtime call to use in that case, and so we'd just assert.
Unfortunately, as it turns out, it is possible to hit this case from
standard-ish C code in rare cases. A particular case a user ran into yesterday
involved an __int128 induction variable and a loop with a quadratic (not
linear) recurrence which triggered some backend logic using SCEVExpander. In
this case, the BinomialCoefficient code in SCEV generates some i129 variables,
which get widened to i256. At a high level, this is not actually good (i.e. the
underlying optimization, PPCLoopPreIncPrep, should not be transforming the loop
in question for performance reasons), but regardless, the backend shouldn't
crash because of cost-modeling issues in the optimizer.
This is a straightforward implementation of the multiplication expansion, based
on the algorithm in Hacker's Delight. I validated it against the code for the
mul256b function from http://locklessinc.com/articles/256bit_arithmetic/ using
random inputs. There should be no functional change for previously-working code
(the new expansion code only replaces an assert).
Fixes PR19797.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270720 91177308-0d34-0410-b5e6-96231b3b80d8
Instead of this:
i32.const $push10=, __stack_pointer
i32.load $push11=, 0($pop10)
Emit this:
i32.const $push10=, 0
i32.load $push11=, __stack_pointer($pop10)
It's not currently clear which is better, though there's a chance the second
form may be better at overall compression. We can revisit this when we have
more data; for now it makes sense to make PEI consistent with isel.
Differential Revision: http://reviews.llvm.org/D20411
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270635 91177308-0d34-0410-b5e6-96231b3b80d8
Replace bidirectional flow analysis to compute liveness with forward
analysis pass. Treat lifetimes as starting when there is a first
reference to the stack slot, as opposed to starting at the point of the
lifetime.start intrinsic, so as to increase the number of stack
variables we can overlap.
Reviewers: gbiv, qcolumbet, wmi
Differential Revision: http://reviews.llvm.org/D18827
Bug: 25776
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270559 91177308-0d34-0410-b5e6-96231b3b80d8
They were accidentally using the 32-bit load/store instruction for
8/16-bit operations, due to incorrect patterns
(8/16-bit cmpxchg and atomicrmw will be fixed in subsequent changes)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270486 91177308-0d34-0410-b5e6-96231b3b80d8
The exit-on-error flag on the many_args1.ll test is needed to avoid an
unreachable in BPFTargetLowering::LowerCall. We can also avoid it by ignoring
any superfluous arguments to the call (i.e. any arguments after the first 5).
Fixes PR27766.
Differential Revision: http://reviews.llvm.org/D20471
v2 of r270419
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270440 91177308-0d34-0410-b5e6-96231b3b80d8
This patch reverts r270419 because it broke a lot of buildbots,
mostly Windows. We'd like help in investigating the issues, but
for now, it should stay out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270433 91177308-0d34-0410-b5e6-96231b3b80d8
The exit-on-error flag on the many_args1.ll test is needed to avoid an
unreachable in BPFTargetLowering::LowerCall. We can also avoid it by ignoring
any superfluous arguments to the call (i.e. any arguments after the first 5).
Fixes PR27766
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270419 91177308-0d34-0410-b5e6-96231b3b80d8
Due to an erratum in some versions of LEON, we must insert a NOP after any LD or LDF instruction to ensure the processor has time to load the value correctly before using it. This pass will implement that erratum fix.
The code will have no effect for other Sparc, but non-LEON processors.
Differential Review: http://reviews.llvm.org/D20353
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270417 91177308-0d34-0410-b5e6-96231b3b80d8
This isn't the complete fix, but it handles the trivial examples of duplicate vzero* ops in PR27823:
https://llvm.org/bugs/show_bug.cgi?id=27823
...and amusingly, the bogus cases already exist as regression tests, so let's take this baby step.
We'll need to do more in the general case where there's legitimate AVX usage in the function + there's
already a vzero in the code.
Differential Revision: http://reviews.llvm.org/D20477
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270378 91177308-0d34-0410-b5e6-96231b3b80d8