This will allow the target to tailor the constraints to the
auto-detected ISA extensions.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Adds tcg_gen_extract_* and tcg_gen_sextract_* for extraction of
fixed position bitfields, much like we already have for deposit.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Previously we allowed fully unaligned operations, but not operations
that are aligned but with less alignment than the operation size.
In addition, arm32, ia64, mips, and sparc had been omitted from the
previous overalignment patch, which would have led to that alignment
being enforced.
Signed-off-by: Richard Henderson <rth@twiddle.net>
These use guard symbols like TCG_TARGET_$target.
scripts/clean-header-guards.pl doesn't like them because they don't
match their file name (they should, to make guard collisions less
likely).
Clean them up: use guard symbol $target_TCG_TARGET_H for
tcg/$target/tcg-target.h.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Some architectures (e.g. ARMv8) need the address which is aligned
to a size more than the size of the memory access.
To support such check it's enough the current costless alignment
check implementation in QEMU, but we need to support
an alignment size specifying.
Signed-off-by: Sergey Sorokin <afarallax@yandex.ru>
Message-Id: <1466705806-679898-1-git-send-email-afarallax@yandex.ru>
Signed-off-by: Richard Henderson <rth@twiddle.net>
[rth: Assert in tcg_canonicalize_memop. Leave get_alignment_bits
available for, though unused by, user-mode. Retain logging difference
based on ALIGNED_ONLY.]
While we can store constants via constrants on INDEX_op_st_i32 et al,
we weren't able to spill constants to backing store.
Add a new backend interface, tcg_out_sti, which may store the constant
(and is allowed to fail). Rearrange the temp_* helpers so that we only
attempt to directly store a constant when the temp is becoming dead/free.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Briefly describe in a comment how direct block chaining is done. It
should help in understanding of the following data fields.
Rename some fields in TranslationBlock and TCGContext structures to
better reflect their purpose (dropping excessive 'tb_' prefix in
TranslationBlock but keeping it in TCGContext):
tb_next_offset => jmp_reset_offset
tb_jmp_offset => jmp_insn_offset
tb_next => jmp_target_addr
jmp_next => jmp_list_next
jmp_first => jmp_list_first
Avoid using a magic constant as an invalid offset which is used to
indicate that there's no n-th jump generated.
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Ensure direct jump patching in AArch64 is atomic by using
atomic_read()/atomic_set() for code patching.
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Message-Id: <1461341333-19646-9-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Check for CONFIG_DEBUG_TCG instead of NDEBUG, drop now useless code.
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 1461228530-14852-2-git-send-email-aurelien@aurel32.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The TCG code is quite performance sensitive, but at the same time can
also be quite tricky. That is why asserts that can be enabled with the
--enable-debug-tcg configure option.
This used to work the following way:
| #include "config.h"
|
| ...
|
| #if !defined(CONFIG_DEBUG_TCG) && !defined(NDEBUG)
| /* define it to suppress various consistency checks (faster) */
| #define NDEBUG
| #endif
|
| ...
|
| #include <assert.h>
Since commit 757e725b (tcg: Clean up includes) "config.h" as been
replaced by "qemu/osdep.h" which itself includes <assert.h>. As a
consequence the assertions are always enabled, even when using
--disable-debug-tcg, causing a performance regression, especially on
targets with many registers. For instance on qemu-system-ppc the
speed difference is about 15%.
tcg_debug_assert is controlled directly by CONFIG_DEBUG_TCG and already
uses in some places. This patch replaces all the calls to assert into
calss to tcg_debug_assert.
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 1461228530-14852-1-git-send-email-aurelien@aurel32.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit 757e725b58 added a number of #include "qemu/osdep.h"
files to the tcg-target.c files (as they were named at the time).
These are unnecessary because these files are not standalone C
files, and the tcg/tcg.c file which includes them will have
already included osdep.h on their behalf. Remove the unneeded
include directives.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1456238983-10160-4-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Rename the per-architecture tcg-target.c files to tcg-target.inc.c.
This makes it clearer that they are not intended to be standalone
C files, but are instead #included into another source file.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1456238983-10160-2-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.
This commit was created with scripts/clean-includes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1453832250-766-16-git-send-email-peter.maydell@linaro.org
In ffc6372851, we swapped the guest
base to the address base register from the address index register.
Except that 31 in the base slot is SP not XZR, so we need to be
more intelligent about which reg gets placed in which slot.
Cc: qemu-stable@nongnu.org (v2.4.0)
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
As we have removed CONFIG_USE_GUEST_BASE, we always use a guest base
and the macros GUEST_BASE and RESERVED_VA become useless: replace
them by their values.
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <1440420834-8388-1-git-send-email-laurent@vivier.eu>
Signed-off-by: Richard Henderson <rth@twiddle.net>
All tcg host architectures now support the guest base and as
there is no real performance lost, it can be always enabled.
Anyway, guest base use can be disabled lively by setting guest
base to 0.
CONFIG_USE_GUEST_BASE is defined as (USE_GUEST_BASE && USER_ONLY),
it should have to be replaced by CONFIG_USER_ONLY in non CONFIG_USER_ONLY
parts, but as some other parts are using !CONFIG_SOFTMMU I have chosen to
use !CONFIG_SOFTMMU instead.
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <1440373328-9788-2-git-send-email-laurent@vivier.eu>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Rather than allow arbitrary shift+trunc, only concern ourselves
with low and high parts. This is all that was being used anyway.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Implement real ext_i32_i64 and extu_i32_i64 ops. They ensure that a
32-bit value is always converted to a 64-bit value and not propagated
through the register allocator or the optimizer.
Cc: Andrzej Zaborowski <balrogg@gmail.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Blue Swirl <blauwirbel@gmail.com>
Cc: Stefan Weil <sw@weilnetz.de>
Acked-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The op is sometimes named trunc_shr_i32 and sometimes trunc_shr_i64_i32,
and the name in the README doesn't match the name offered to the
frontends.
Always use the long name to make it clear it is a size changing op.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Similar to the same fix for user-mode, except this instance
occurs on the softmmu path. Again, the tlb addend must be
the base register, while the guest address is the index.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Thanks to the previous patch, it is now easy for tcg_out_qemu_ld and
tcg_out_qemu_st to use a 32-bit zero extended offset. However, the
guest base register x28 must be the base and addr_reg must be the
index.
Reported-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1436974021-28978-3-git-send-email-pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The new argument lets you pick uxtw or uxtx mode for the offset
register. For now, all callers pass TCG_TYPE_I64 so that uxtx
is generated. The bits for uxtx are removed from I3312_TO_I3310.
Reported-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1436974021-28978-2-git-send-email-pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The addition of MO_AMASK means that places that used inverted masks
need to be changed to use positive masks, and places that failed to
mask the intended bits need updating.
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
Tested-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This will be used to size the TLB when more than 8 MMU modes are
used by the target. Limitations come from the limited size of
the immediate fields (which sometimes, as in the case of Aarch64,
extend to instructions that shift the immediate).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1424436345-37924-2-git-send-email-pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
The extra information is not yet used but it is now available.
This requires minor changes through all of the tcg backends.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
At the tcg opcode level, not at the tcg-op.h generator level.
This requires minor changes through all of the tcg backends,
but none of the cpu translators.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This is less about improved type checking than enabling a
subsequent change to the representation of labels.
Acked-by: Claudio Fontana <claudio.fontana@huawei.com>
Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
Cc: Andrzej Zaborowski <balrogg@gmail.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Blue Swirl <blauwirbel@gmail.com>
Cc: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The "old" qemu_ld opcode did not specify the size of the result,
and so we had to assume full register width. With the new opcodes,
we can narrow the result.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Since all backends have been converted, remove the compatibility code.
Acked-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The INDEX_op_call case has just been obsoleted; the mov and movi
cases have not been reachable for years. Attempt to document this
both in each tcg_out_op switch, and via TCG_OPF_NOT_PRESENT.
Because of the TCG_OPF_NOT_PRESENT change, this must be done for
all targets in a single commit.
Signed-off-by: Richard Henderson <rth@twiddle.net>
And use tcg pointer differencing functions as appropriate.
Acked-by: Claudio Fontana <claudio.fontana@huawei.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Most 64-bit targets need to be able to ignore the high bits
of a TCG_TYPE_I32 value.
Suggested-by: Stuart Brady <sdb@zubnet.me.uk>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Static code analyzers complain about signed bitfields with only a single
bit. is_ld is used as a boolean value, so make it bool.
ppc64 already used bool for the 2nd argument is_ld of the local function
add_qemu_ldst_label. Modify all other TCG targets to do follow this
example.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The assembler seems to prefer them, perhaps we should too.
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Replace aarch64_ldst_op_data with AArch64LdstType, as it wasn't encoded
for the proper shift for the field and was confusing.
Merge aarch64_ldst_op_data, AArch64LdstType, and a few stray opcode bits
into a single I3312_* argument, eliminating some magic numbers from the
helper functions.
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Cleaning up the implementation of REV and REV16 at the same time.
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Making the bswap conditional on the memop instead of a compile-time test.
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
In some cases, a direct branch will be in range.
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Some guest env are small enough to reach the tlb with only a 12-bit addition.
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Combines 4 other inline functions and tidies the prologue.
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
It's obviously call-clobbered, but is otherwise unused.
Repurpose it as the TCG temporary.
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
A compare and branch against zero happens at the start of
every single TB.
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Rearrange code to put the compare and branch in the same place.
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The subset of logical immediates that we support is quite quick to test,
and such constants are quite common to want to load.
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
When profitable, initialize the register with MOVN instead of MOVZ,
before setting the remaining lanes with MOVK.
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Rather than raw constants that could mean anything.
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Cleaning up the implementation of tcg_out_movi at the same time.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
Clean up multiply at the same time.
For remainder, generic code will produce mul+sub,
whereas we can implement with msub.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
Also tidy the implementation of ubfm, sbfm, extr in order to share code.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
Handle a simplified set of logical immediates for the moment.
The way gcc and binutils do it, with 52k worth of tables, and
a binary search depth of log2(5334) = 13, seems slow for the
most common cases.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
Avoid the magic numbers in the current implementation.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
This merges the implementation of tcg_out_addi and tcg_out_subi.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
Converting the add/sub (3.5.2) and logical shifted (3.5.10) instruction
groups to the new scheme.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Tested-by: Claudio Fontana <claudio.fontana@huawei.com>
Commit 023261ef85 failed to remove a
nop that's no longer required.
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
At first glance the code appears to be using 1's compliment encoding,
a-la AArch32. Except that the constant is "off", creating a complicated
split field 2's compliment encoding.
Much clearer to just use a normal mask and shift.
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
It was unused. Let's not overcomplicate things before we need them.
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This reduces the code size of the function significantly.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
We assert that the values for _I32 and _I64 are 0 and 1 respectively.
This will make a couple of functions declared by tcg.c cleaner.
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Removed from other targets in 56bbc2f967.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Step two in the transition, adding the new ldst opcodes. Keep the old
opcodes around until all backends support the new opcodes.
Signed-off-by: Richard Henderson <rth@twiddle.net>
A minimal update to use the new helpers with the return address argument.
Tested-by: Claudio Fontana <claudio.fontana@linaro.org>
Reviewed-by: Claudio Fontana <claudio.fontana@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The _cmmu helpers can be moved to exec-all.h. The helpers that are
used from TCG will shortly need access to tcg_target_long so move
their declarations into tcg.h.
This requires minor include adjustments to all TCG backends.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Use them in places where mulu2 and muls2 are used.
Optimize mulx2 with dead low part to mulxh.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
There are several hosts with only a "div" insn. Remainder is computed
manually from the quotient and inputs. We can do this generically.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
implement the 12bit scaled unsigned immediate offset
variant of LDR/STR. This improves code size by avoiding
the movi + ldst_r for naturally aligned offsets in range.
Signed-off-by: Claudio Fontana <claudio.fontana@huawei.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
also put aarch64 in the list of archs that do not need an ldscript.
Signed-off-by: Jani Kokkoken <jani.kokkonen@huawei.com>
Signed-off-by: Claudio Fontana <claudio.fontana@huawei.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 51AF40EE.1000104@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
implement the optional sign/zero extend operations with the dedicated
aarch64 instructions.
Signed-off-by: Claudio Fontana <claudio.fontana@huawei.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 51AC9A58.40502@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
implement the optional byte swap operations with the dedicated
aarch64 instructions.
Signed-off-by: Claudio Fontana <claudio.fontana@huawei.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 51AC9A33.9050003@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
for arith operations, add SUBS, ANDS, ADDS and add a shift parameter
so that all arith instructions can make use of shifted registers.
Signed-off-by: Claudio Fontana <claudio.fontana@huawei.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 51AC998B.7070506@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
add preliminary support for TCG target aarch64.
Signed-off-by: Claudio Fontana <claudio.fontana@huawei.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 51A5C596.3090108@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>