------------------------------------------------------------------------
r295762 | eugenis | 2017-02-21 12:17:34 -0800 (Tue, 21 Feb 2017) | 3 lines
Fix PR31896.
The address of an alias of a global with an offset was incorrectly lowered as the address of the global itself (i.e. ignoring the offset).
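A minimal IR sketch of the pattern (the names are illustrative, not taken from the actual test case):
  @g = global [2 x i32] zeroinitializer
  @a = alias i32, getelementptr inbounds ([2 x i32], [2 x i32]* @g, i32 0, i32 1)
  ; the address of @a is &g + 4; lowering it as plain &g drops the offset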
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@296002 91177308-0d34-0410-b5e6-96231b3b80d8
Support lowering AEABI TLS access (__aeabi_read_tp) with long calls.
This requires adjusting the call sequence to use an indirect call to get
full addressability.
Resolves PR31769!
By Saleem Abdulrasool!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@295910 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r295512 | matze | 2017-02-17 15:15:03 -0800 (Fri, 17 Feb 2017) | 8 lines
AArch64LoadStoreOptimizer: Correctly clear kill flags
When promoting the Load of a Store-Load pair to a COPY, all kill flags
between the store and the load need to be cleared.
rdar://30402435
Differential Revision: https://reviews.llvm.org/D30110
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@295744 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r294527 | arnolds | 2017-02-08 14:30:47 -0800 (Wed, 08 Feb 2017) | 14 lines
[ARM/AArch ISel] SwiftCC: First parameters that are marked swiftself are not 'this returns'
We mark X0 as preserved by a call that passes the returned parameter.
x0 = ...
fun(x0) // no implicit def of x0
This no longer is valid if we pass the parameter in a different register than
the returned value, as is the case with a swiftself parameter (passed in x20):
x20 = ...
fun(x20) // there should be an implicit def of x0
rdar://30425845
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@295135 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r294551 | arnolds | 2017-02-08 17:52:17 -0800 (Wed, 08 Feb 2017) | 10 lines
SwiftCC: The swifterror register cannot be used as the base register
Functions that have a dynamic alloca require a base register, which is defined to
be X19 on AArch64 and R6 on ARM. We have defined the swifterror register to be
the same register. Use a different callee-saved register for swifterror instead:
X21 on AArch64
R8 on ARM
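For illustration of the clash described above, a hedged IR sketch (hypothetical function) where both features would otherwise compete for the same register:
  declare void @use(i8*)
  define void @f(i64 %n, i8** swifterror %err) {
    ; the dynamic alloca needs a base register; %err lives in the swifterror register
    %buf = alloca i8, i64 %n
    call void @use(i8* %buf)
    ret void
  }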
rdar://30433803
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@295079 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r294348 | hans | 2017-02-07 12:37:45 -0800 (Tue, 07 Feb 2017) | 6 lines
[X86] Disable conditional tail calls (PR31257)
They are currently modelled incorrectly (as calls, which clobber
registers, confusing e.g. Machine Copy Propagation).
Reverting until we figure out the proper solution.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@294476 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r294203 | john.brawn | 2017-02-06 10:07:20 -0800 (Mon, 06 Feb 2017) | 9 lines
[AArch64] Fix incorrect MachinePointerInfo in splitStoreSplat
When splitting up one store into several in splitStoreSplat, we have to
make sure we get the MachinePointerInfo right; otherwise alias
analysis thinks they all store to the same location. This can then
cause invalid scheduling later on.
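As a rough illustration (assuming the zero-splat path; the function below is hypothetical), a store like this may be split into two 8-byte stores at offsets 0 and 8, and each piece must carry pointer info for its own offset:
  define void @zero(<4 x i32>* %p) {
    store <4 x i32> zeroinitializer, <4 x i32>* %p, align 16
    ret void
  }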
Differential Revision: https://reviews.llvm.org/D29446
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@294242 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r292624 | petarj | 2017-01-20 09:53:30 -0800 (Fri, 20 Jan 2017) | 9 lines
[mips] Fix debug information for __thread variable
This patch fixes debug information for __thread variables on Mips
using the .dtprelword and .dtpreldword directives.
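For reference, a minimal sketch of the kind of input affected (hypothetical variable); with debug info, its DWARF location on Mips should now use .dtprelword/.dtpreldword rather than an absolute address:
  ; corresponds to "__thread int tls;" in C
  @tls = thread_local global i32 0, align 4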
Patch by Aleksandar Beserminji.
Differential Revision: http://reviews.llvm.org/D28770
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293664 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r293417 | jhibbits | 2017-01-28 20:55:57 -0800 (Sat, 28 Jan 2017) | 16 lines
Add some Book-E instructions to the asm parser and printer.
Summary:
Adds the following instructions:
* mfpmr
* mtpmr
* icblc
* icblq
* icbtls
Fix the scheduling for mtspr on the e5500, which uses CFX0 instead of
SFX0/SFX1 as on the e500mc.
Addresses PR31538.
Differential Revision: https://reviews.llvm.org/D29002
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293651 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r293310 | arsenm | 2017-01-27 09:42:26 -0800 (Fri, 27 Jan 2017) | 8 lines
AMDGPU: Enable FeatureFlatForGlobal on Volcanic Islands
Accomplishes what r292982 was supposed to do; that commit ended up
only making the necessary test changes.
This should be applied to the 4.0 branch.
Patch by Vedran Miletić <vedran@miletic.net>
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293329 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r292982 | arsenm | 2017-01-24 14:02:15 -0800 (Tue, 24 Jan 2017) | 8 lines
Enable FeatureFlatForGlobal on Volcanic Islands
This switches to the workaround that HSA defaults to
for the mesa path.
This should be applied to the 4.0 branch.
Patch by Vedran Miletić <vedran@miletic.net>
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293326 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r292473 | arsenm | 2017-01-18 22:35:27 -0800 (Wed, 18 Jan 2017) | 9 lines
AMDGPU: Disable some fneg combines unless nsz
For -(x + y) -> (-x) + (-y), if x == -y, this would
change the result from -0.0 to 0.0. Since the fma/fmad
combine is an extension of this problem, it also
applies there.
fmul should be fine, and I don't think any of the unary
operators or conversions should be a problem either.
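A worked illustration of why the fold needs nsz (values chosen for the example):
  ; with x = 1.0 and y = -1.0:
  ;   -(x + y)    = -(0.0)     = -0.0
  ;   (-x) + (-y) = -1.0 + 1.0 =  0.0
  define float @fold(float %x, float %y) {
    %add = fadd float %x, %y
    %neg = fsub float -0.000000e+00, %add   ; fneg spelled as fsub -0.0, %add
    ret float %neg
  }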
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293319 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r292472 | arsenm | 2017-01-18 22:04:12 -0800 (Wed, 18 Jan 2017) | 5 lines
AMDGPU: Remove modifiers from v_div_scale_*
They seem to produce nonsense results when used.
This should be applied to the release branch.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293317 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r293259 | compnerd | 2017-01-26 19:41:53 -0800 (Thu, 26 Jan 2017) | 11 lines
ARM: fix vectorized division on WoA
The Windows on ARM target uses custom division for normal division as
the backend needs to insert division-by-zero checks. However, it is
designed to handle only non-vectorized division. ARM has custom
lowering for vectorized division, which avoids loading the values into
registers and invoking a division routine for each element, lowering to
NEON instructions instead. Fall back to that custom lowering when we
encounter a vectorized division.
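For example (a hedged sketch; the exact vector types that take the NEON path are an assumption here), a division such as the following should use the vector lowering rather than the scalar division-by-zero-checking path:
  define <4 x i16> @vdiv(<4 x i16> %a, <4 x i16> %b) {
    %q = sdiv <4 x i16> %a, %b
    ret <4 x i16> %q
  }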
Resolves PR31778!
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293306 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r292712 | ctopper | 2017-01-20 22:59:35 -0800 (Fri, 20 Jan 2017) | 1 line
[X86] Add test cases that show bad commuting being allowed to create a phsub operation.
------------------------------------------------------------------------
------------------------------------------------------------------------
r292713 | ctopper | 2017-01-20 22:59:38 -0800 (Fri, 20 Jan 2017) | 3 lines
[X86] Don't allow commuting to form phsub operations.
Fixes PR31714.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293299 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r292516 | rserge | 2017-01-19 12:24:23 -0800 (Thu, 19 Jan 2017) | 14 lines
[XRay][Arm] Repair XRay table emission on Arm32 and add tests to catch such problems earlier
Summary:
Emission of the XRay table was accidentally disabled for Arm32, and the bug went undetected because testing of XRay on 32-bit Arm targets had (also by mistake) been disabled. This patch fixes the emission and adds tests so that such problems are detected in the future.
This patch is one of a series, see also
- https://reviews.llvm.org/D28623
Reviewers: rengolin, dberris
Reviewed By: dberris
Subscribers: llvm-commits, aemerson, rengolin, dberris, iid_iunknown
Differential Revision: https://reviews.llvm.org/D28624
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293295 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r293000 | thomas.stellard | 2017-01-24 17:25:13 -0800 (Tue, 24 Jan 2017) | 15 lines
AMDGPU: Add support for spilling to buffers pointed to by a user SGPR
Summary:
This lets you select which sort of spilling you want, either s[0:1] or 64-bit loads from s[0:1].
Patch By: Dave Airlie
Reviewers: nhaehnle, arsenm, tstellarAMD
Reviewed By: arsenm
Subscribers: mareko, llvm-commits, kzhuravl, wdng, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D25428
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293240 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r292444 | mkuper | 2017-01-18 15:05:58 -0800 (Wed, 18 Jan 2017) | 7 lines
Revert r291670 because it introduces a crash.
r291670 doesn't crash on the original testcase from PR31589,
but it crashes on a slightly more complex one.
PR31589 has the new reproducer.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@293070 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r291909 | compnerd | 2017-01-13 08:25:33 -0800 (Fri, 13 Jan 2017) | 9 lines
ARM: match GCC's behaviour for builtins
GCC changes the CC between the user code and the builtins based on the
value of `-target` rather than `-mfloat-abi`. When an HF target is used,
the VFP variant of the AAPCS CC is used. Otherwise, the AAPCS variant
is used. In all cases, the AEABI functions use the AAPCS CC. Adjust
the calling convention based on the target.
Resolves PR30543!
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@292951 91177308-0d34-0410-b5e6-96231b3b80d8
------------------------------------------------------------------------
r292242 | bwilson | 2017-01-17 11:18:57 -0800 (Tue, 17 Jan 2017) | 5 lines
Revert r291640 change to fold X86 comparison with atomic_load_add.
Even with the fix from r291630, this still causes problems. I get
widespread assertion failures in the Swift runtime's WeakRefCount::increment()
function. I sent a reduced testcase in reply to the commit.
------------------------------------------------------------------------
git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_40@292243 91177308-0d34-0410-b5e6-96231b3b80d8
Emit SHRQ/SHLQ instead of ANDQ with a 64-bit constant mask if the result
is unused and the mask has only higher/lower bits set. For example, with
this patch LLVM emits
shrq $41, %rdi
je
instead of
movabsq $0xFFFFFE0000000000, %rcx
testq %rcx, %rdi
je
This reduces the number of instructions, code size, and register pressure.
The transformation is applied only in cases where the mask cannot be
encoded as an immediate value within the TESTQ instruction.
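For reference, a hedged IR sketch of the kind of comparison involved (the constant matches the mask in the example above):
  define i1 @highbits_zero(i64 %x) {
    ; mask 0xFFFFFE0000000000 has only bits 41-63 set and does not fit a TESTQ immediate
    %m = and i64 %x, -2199023255552
    %c = icmp eq i64 %m, 0
    ret i1 %c
  }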
Differential Revision: https://reviews.llvm.org/D28198
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291806 91177308-0d34-0410-b5e6-96231b3b80d8
64-bit integer division on Intel CPUs is extremely slow, much slower
than 32-bit division. On the other hand, 8-bit and 16-bit divisions
aren't any faster. The only important exception is Atom, where DIV8
is fastest. Because of that, the patch
1) Enables bypassing of 64-bit division for Atom, Silvermont and
all big cores.
2) Modifies 64-bit bypassing to use 32-bit division instead of
the 16-bit one. This doesn't make the shorter division slower but
increases the chances of taking it. Moreover, it is much more likely
that a value can be proven at compile time to fit in 32 bits and not
require a run-time check (e.g. zext i32 to i64).
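As a sketch (hypothetical function), a 64-bit division whose operands are zero-extended from i32, so the 32-bit form can be proven safe at compile time:
  define i64 @div32ok(i32 %a, i32 %b) {
    %az = zext i32 %a to i64
    %bz = zext i32 %b to i64
    %q = udiv i64 %az, %bz
    ret i64 %q
  }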
Differential Revision: https://reviews.llvm.org/D28196
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291800 91177308-0d34-0410-b5e6-96231b3b80d8
Switch some additional library call setup to be table-driven. This
makes it more immediately obvious what the library call looks like.
This is important for ARM since the calling conventions for the builtins
change based on the target/libcall name. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291789 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The register bank is now entirely initialized in the constructor. However,
we still have a hardcoded number of register classes, which will be
dealt with in the TableGen patch (D27338) since we do not have access
to this information at this stage. The number of register
classes is known to the TRI and to TableGen but the RegisterBank
constructor is too early for the former and too late for the latter.
This will be fixed when the data is tablegen-erated.
Reviewers: t.p.northover, ab, rovka, qcolombet
Subscribers: aditya_nandakumar, kristof.beyls, vkalintiris, llvm-commits, dberris
Differential Revision: https://reviews.llvm.org/D27809
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291770 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Refactor the RegisterBank initialization to use static data. This requires
GlobalISel implementations to rewrite calls to createRegisterBank() and
addRegBankCoverage() into a call to setRegBankData().
Out of tree targets can use diff 4 of D27807
(https://reviews.llvm.org/D27807?id=84117) to have addRegBankCoverage() dump
the register classes and other data that needs to be provided to
setRegBankData(). This is the method that was used to generate the static data
in this patch.
Tablegen-eration of this static data will follow after some refactoring.
Reviewers: t.p.northover, ab, rovka, qcolombet
Subscribers: aditya_nandakumar, kristof.beyls, vkalintiris, llvm-commits, dberris
Differential Revision: https://reviews.llvm.org/D27807
Differential Revision: https://reviews.llvm.org/D27808
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291768 91177308-0d34-0410-b5e6-96231b3b80d8
r289653 added a case where `vselect <cond> <vector1> <all-zeros>`
is transformed to:
`vselect xor(cond, DAG.getConstant(1, DL, CondVT)) <all-zeros> <vector1>`
This was not aimed at catching cases where Cond is not a vXi1
mask, but it does. Moreover, when Cond's type is vXiN (N > 1),
xor(cond, DAG.getConstant(1, DL, CondVT)) != NOT(cond).
This patch changes the above to an xor with an all-ones constant, and avoids
entering the case for non-mask Conds.
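A worked illustration of why xor with 1 is not a logical NOT for a non-vXi1 condition (lane values chosen for the example):
  ; for an i8 lane holding 2 (nonzero, i.e. "true" for a non-vXi1 mask):
  ;   xor(2, 1)  = 3    ; only bit 0 flips, still nonzero
  ;   xor(2, -1) = -3   ; bitwise NOT
  define <4 x i8> @flip(<4 x i8> %cond) {
    %notc = xor <4 x i8> %cond, <i8 -1, i8 -1, i8 -1, i8 -1>   ; NOT(cond) via an all-ones constant
    ret <4 x i8> %notc
  }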
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291745 91177308-0d34-0410-b5e6-96231b3b80d8
This produces worse code when i16 is legal, mostly
due to combines getting confused by conversions inserted
for uniform 16-bit operations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291717 91177308-0d34-0410-b5e6-96231b3b80d8
This was shrinking the instruction even though the carry output
register was a virtual register, not known to be VCC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291716 91177308-0d34-0410-b5e6-96231b3b80d8