The pass emits a call to sqrt that has attribute "read-none". This call will be
converted to an ISD::FSQRT node during DAG construction, which will turn into
a mips native sqrt instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183802 91177308-0d34-0410-b5e6-96231b3b80d8
Sign- and zero-extension folding was slightly incorrect because it wasn't checking that the shift on extensions was zero. Further, I recently added AND rd, rn, #255 as a form of 8-bit zero extension, and failed to add the folding code for it.
This patch fixes both issues.
This patch fixes both, and the test should remain the same:
test/CodeGen/ARM/fast-isel-fold.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183794 91177308-0d34-0410-b5e6-96231b3b80d8
Negative zero is returned by the primary expression parser as INT32_MIN, so all that the method needs to do is to accept this value.
Behavior already present for Thumb2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183734 91177308-0d34-0410-b5e6-96231b3b80d8
- Don't use assert(0), or tests may pass or fail according to assertions.
- For now, The tests are marked as XFAIL for win32 hosts.
FIXME: Could we avoid XFAIL to specify triple in the RUN lines?
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183728 91177308-0d34-0410-b5e6-96231b3b80d8
Some ARM CPUs only support ARM mode (ancient v4 ones, for example) and some
only support Thumb mode (M-class ones currently). This makes sure such CPUs
default to the correct mode and makes the AsmParser diagnose an attempt to
switch modes incorrectly.
rdar://14024354
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183710 91177308-0d34-0410-b5e6-96231b3b80d8
Previously LEA64_32r went through virtually the entire backend thinking it was
using 32-bit registers until its blissful illusions were cruelly snatched away
by MCInstLower and 64-bit equivalents were substituted at the last minute.
This patch makes it behave normally, and take 64-bit registers as sources all
the way through. Previous uses (for 32-bit arithmetic) are accommodated via
SUBREG_TO_REG instructions which make the types and classes agree properly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183693 91177308-0d34-0410-b5e6-96231b3b80d8
A plain "sc" without argument is supposed to be treated like "sc 0"
by the assembler. This patch adds a corresponding alias.
Problem reported by Joerg Sonnenberger.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183687 91177308-0d34-0410-b5e6-96231b3b80d8
The extended branch mnemonics are supposed to use an implied CR0
if there is no explicit condition register specified. This patch
adds extra variants of the mnemonics to this effect.
Problem reported by Joerg Sonnenberger.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183686 91177308-0d34-0410-b5e6-96231b3b80d8
This patch removes some redundancy by generating the extended branch
mnemonics via a multiclass.
No change in behaviour expected.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183685 91177308-0d34-0410-b5e6-96231b3b80d8
the Mips16 port. A few of the psuedos could either take signed
or unsigned arguments and I did not distinguish the case and improperly
rejected some valid cases that the assembler had previously accepted
when they were pure pseudos that expanded as assembly instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183633 91177308-0d34-0410-b5e6-96231b3b80d8
Changes to ARM unwind opcode assembler:
* Fix multiple .save or .vsave directives. Besides, the
order is preserved now.
* For the directives which will generate multiple opcodes,
such as ".save {r0-r11}", the order of the unwind opcode
is fixed now, i.e. the registers with less encoding value
are popped first.
* Fix the $sp offset calculation. Now, we can use the
.setfp, .pad, .save, and .vsave directives at any order.
Changes to test cases:
* Add test cases to check the order of multiple opcodes
for the .save directive.
* Fix the incorrect $sp offset in the test case. The
stack pointer offset specified in the test case was
incorrect. (Changed test cases: ehabi-mc-section.ll and
ehabi-mc.ll)
* The opcode to restore $sp are slightly reordered. The
behavior are not changed, and the new output is same
as the output of GNU as. (Changed test cases:
eh-directive-pad.s and eh-directive-setfp.s)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183627 91177308-0d34-0410-b5e6-96231b3b80d8
The register classes when emitting loads weren't quite restricting enough, leading to MI verification failure on the result register.
These are new failures that weren't there the first time I tried enabling ARM FastISel for new targets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183624 91177308-0d34-0410-b5e6-96231b3b80d8
Handle the case when the disassembler table can't tell
the difference between some encodings of QADD and CPS.
Add some necessary safe guards in CPS decoding as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183610 91177308-0d34-0410-b5e6-96231b3b80d8
This is using a hint from AMD APP OpenCL Programming Guide with
empirically tweaked parameters.
I used Unigine Heaven 3.0 to determine best parameters on my system
(i7 2600/Radeon 6950/Kernel 3.9.4) the benchmark :
it went from 38.8 average fps to 39.6, which is ~3% gain.
(Lightmark 2008.2 gain is much more marginal: from 537 to 539)
There is no lit test provided as the parameter were determined
empirically and it it would be nearly impossiblet to find a test
program that check for optimal behavior.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183593 91177308-0d34-0410-b5e6-96231b3b80d8
On PPC32, [su]div,rem on i64 types are transformed into runtime library
function calls. As a result, they are not allowed in counter-based loops (the
counter-loops verification pass caught this error; this change fixes PR16169).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183581 91177308-0d34-0410-b5e6-96231b3b80d8
We weren't computing structure size correctly and we were relying on
the original alloca instruction to compute the offset, which isn't
always reliable.
Reviewed-by: Vincent Lejeune <vljn@ovi.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183568 91177308-0d34-0410-b5e6-96231b3b80d8
This should simplify the subtarget definitions and make it easier to
add new ones.
Reviewed-by: Vincent Lejeune <vljn@ovi.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183566 91177308-0d34-0410-b5e6-96231b3b80d8
My recent ARM FastISel patch exposed this bug:
http://llvm.org/bugs/show_bug.cgi?id=16178
The root cause is that it can't select integer sext/zext pre-ARMv6 and
asserts out.
The current integer sext/zext code doesn't handle other cases gracefully
either, so this patch makes it handle all sext and zext from i1/i8/i16
to i8/i16/i32, with and without ARMv6, both in Thumb and ARM mode. This
should fix the bug as well as make FastISel faster because it bails to
SelectionDAG less often. See fastisel-ext.patch for this.
fastisel-ext-tests.patch changes current tests to always use reg-imm AND
for 8-bit zext instead of UXTB. This simplifies code since it is
supported on ARMv4t and later, and at least on A15 both should perform
exactly the same (both have exec 1 uop 1, type I).
2013-05-31-char-shift-crash.ll is a bitcode version of the above bug
16178 repro.
fast-isel-ext.ll tests all sext/zext combinations that ARM FastISel
should now handle.
Note that my ARM FastISel enabling patch was reverted due to a separate
failure when dealing with MCJIT, I'll fix this second failure and then
turn FastISel on again for non-iOS ARM targets.
I've tested "make check-all" on my x86 box, and "lnt test-suite" on A15
hardware.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183551 91177308-0d34-0410-b5e6-96231b3b80d8
Reapply 183270 again (because three is a magic number).
This should now no longer seg fault after r183459.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183464 91177308-0d34-0410-b5e6-96231b3b80d8
Previously commited @183279 but tests were failing, reverted @183286
It was broken because @183336 was missing, now it's there.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183343 91177308-0d34-0410-b5e6-96231b3b80d8
Add some generic SchedWrites and assign resources for Swift and Cortex A9.
Reapply of r183257. (Removed empty InstRW for division on swift)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183319 91177308-0d34-0410-b5e6-96231b3b80d8
An instruction with less than 3 inputs is trivially a fast immediate shift.
Reapply of 183256, should not have caused the tablegen segfault on linux either.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183314 91177308-0d34-0410-b5e6-96231b3b80d8