------------------------------------------------------------------------
r293635 | nha | 2017-01-31 06:35:37 -0800 (Tue, 31 Jan 2017) | 16 lines
[DAGCombine] require UnsafeFPMath for re-association of addition
Summary:
The affected transforms all implicitly use associativity of addition,
for which we usually require unsafe math to be enabled.
The "Aggressive" flag is only meant to convey information about the
performance of the fused ops relative to a fmul+fadd sequence.
Fixes Bug 31626.
Reviewers: spatel, hfinkel, mehdi_amini, arsenm, tstellarAMD
Subscribers: jholewinski, nemanjai, wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D28675
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293940 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r293310 | arsenm | 2017-01-27 09:42:26 -0800 (Fri, 27 Jan 2017) | 8 lines
AMDGPU: Enable FeatureFlatForGlobal on Volcanic Islands
Accomplishes what r292982 was supposed to, which ended up
only really making the necessary test changes.
This should be applied to the 4.0 branch.
Patch by Vedran Miletić <vedran@miletic.net>
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293329 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r292982 | arsenm | 2017-01-24 14:02:15 -0800 (Tue, 24 Jan 2017) | 8 lines
Enable FeatureFlatForGlobal on Volcanic Islands
This switches to the workaround that HSA defaults to
for the mesa path.
This should be applied to the 4.0 branch.
Patch by Vedran Miletić <vedran@miletic.net>
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293326 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r292473 | arsenm | 2017-01-18 22:35:27 -0800 (Wed, 18 Jan 2017) | 9 lines
AMDGPU: Disable some fneg combines unless nsz
For -(x + y) -> (-x) + (-y), if x == -y, this would
change the result from -0.0 to 0.0. Since the fma/fmad
combine is an extension of this problem it also
applies there.
fmul should be fine, and I don't think any of the unary
operators or conversions should be a problem either.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293319 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r292472 | arsenm | 2017-01-18 22:04:12 -0800 (Wed, 18 Jan 2017) | 5 lines
AMDGPU: Remove modifiers from v_div_scale_*
They seem to produce nonsense results when used.
This should be applied to the release branch.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293317 91177308-0d34-0410-b5e6-96231b3b80d8
This produces worse code when i16 is legal, mostly
due to combines getting confused by conversions inserted
for uniform 16-bit operations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291717 91177308-0d34-0410-b5e6-96231b3b80d8
This was shrinking the instruction even though the carry output
register was a virtual register, not known VCC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291716 91177308-0d34-0410-b5e6-96231b3b80d8
Even with aggressive fusion enabled, this requires duplicating
the fmul, or increases an fadd to another fma which is not an
improvement.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291642 91177308-0d34-0410-b5e6-96231b3b80d8
When choosing the best successor for a block, ordinarily we would have preferred
a block that preserves the CFG unless there is a strong probability the other
direction. For small blocks that can be duplicated we now skip that requirement
as well.
Differential revision: https://reviews.llvm.org/D27742
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291609 91177308-0d34-0410-b5e6-96231b3b80d8
If a vector index is out of bounds, the result is supposed to be
undefined but is not undefined behavior. Change the legalization
for indexing the vector on the stack so that an out of bounds
index does not create an out of bounds memory access.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291604 91177308-0d34-0410-b5e6-96231b3b80d8
For i16 zeroext arguments when i16 was a legal type, the
known bits information from the truncate was lost. Insert
a zeroext so the known bits optimizations work with the 32-bit
loads.
Fixes code quality regressions vs. SI in min.ll test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291461 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Originally
i64 = umax t8, Constant:i64<4>
was expanded into
i32,i32 = umax Constant:i32<0>, Constant:i32<0>
i32,i32 = umax t7, Constant:i32<4>
Now instead the two produced umax:es return i32 instead of i32, i32.
Thanks to Jan Vesely for help with the test case.
Patch by mikael.holmen at ericsson.com
Reviewers: bogner, jvesely, tstellarAMD, arsenm
Subscribers: test, wdng, RKSimon, arsenm, nhaehnle, llvm-commits
Differential Revision: https://reviews.llvm.org/D28135
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291441 91177308-0d34-0410-b5e6-96231b3b80d8
Canonicalize a select with a constant to the false side. This
enables more instruction shrinking opportunities since an
inline immediate can be used for the false side of v_cndmask_b32_e32.
This seems to usually be better but causes some code size regressions
in some tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290372 91177308-0d34-0410-b5e6-96231b3b80d8
FMA is canonicalized to constant in the middle operand. Do
the same so fmad matches and avoid an extra combine step.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290313 91177308-0d34-0410-b5e6-96231b3b80d8
When the instruction is processed the first time, it may be
deleted resulting in crashes. While the new test adds the same
user to the worklist twice, this particular case doesn't crash
but I'm not sure why.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290191 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Without a MachineMemOperand, the scheduler was assuming MIMG instructions
were ordered memory references, so no loads or stores could be reordered
across them.
Reviewers: arsenm
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D27536
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290179 91177308-0d34-0410-b5e6-96231b3b80d8