We weren't ignoring the high 32 bits during a NE comparison.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
* 'tci' of git://qemu.weilnetz.de/qemu:
tci: Make tcg temporaries local to tcg_qemu_tb_exec
tci: Delete unused tb_ret_addr
tci: Avoid code before declarations
tci: Use a local variable for env
tci: Use 32-bit signed offsets to loads/stores
We're moving away from the temporaries stored in env. Make sure we can
differentiate between temp stores and possibly bogus stores for extra
call arguments. Move TCG_AREG0 and TCG_REG_CALL_STACK out of the way
of the parameter passing registers.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off by: Stefan Weil <sw@weilnetz.de>
Since the change to tcg_exit_req, the first insn of every TB is
a load with a negative offset from env.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off by: Stefan Weil <sw@weilnetz.de>
When the TCG condition codes were re-organized last year,
we failed to update all of the "old-style" tests for unsigned.
Signed-off-by: Richard Henderson <rth@twiddle.net>
This can save one insn, if the constant has any bits in 32-63 set,
but no bits in 21-31 set. It never results in more insns.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Since we're always in 64-bit mode, load address performs a full
64-bit add. Use that for 3-address addition, as well as for
larger constant addends when we lack extended-immediates facility.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Since we have a free temporary and can always just load the constant, we
ought to do so, rather than spending the same effort constraining the const.
Signed-off-by: Richard Henderson <rth@twiddle.net>
We only support 64-bit code generation for s390x.
Don't clutter the code with ifdefs that suggest otherwise.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Set TCG_TARGET_CALL_STACK_OFFSET properly for the abi. Allocate the
standard TCG_STATIC_CALL_ARGS_SIZE. And while we're at it, allocate
space for CPU_TEMP_BUF_NLONGS.
Signed-off-by: Richard Henderson <rth@twiddle.net>
In TCG, "target" means the host architecture for which TCG generates
the code. Using "guest" rather than "target" to make the document more
consistent.
Signed-off-by: Chen Wei-Ren <chenwj@iis.sinica.edu.tw>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Fix some of the nasty TCG race conditions and crashes by implementing
cpu_exit() as setting a flag which is checked at the start of each TB.
This avoids crashes if a thread or signal handler calls cpu_exit()
while the execution thread is itself modifying the TB graph (which
may happen in system emulation mode as well as in linux-user mode
with a multithreaded guest binary).
This fixes the crashes seen in LP:668799; however there are another
class of crashes described in LP:1098729 which stem from the fact
that in linux-user with a multithreaded guest all threads will
use and modify the same global TCG date structures (including the
generated code buffer) without any kind of locking. This means that
multithreaded guest binaries are still in the "unsupported"
category.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Document tcg_qemu_tb_exec(). In particular, its return value is a
combination of a pointer to the next translation block and some
extra information in the low two bits. Provide some #defines for
the values passed in these bits to improve code clarity.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Switch the default for qemu_log logging output from "/tmp/qemu.log"
to stderr. This is an incompatible change in some sense, but logging
is mostly used for debugging purposes so it shouldn't affect production
use. The previous behaviour can be obtained by adding "-D /tmp/qemu.log"
to the command line.
This change requires us to:
* update all the documentation/help text (we take the opportunity
to smooth out minor inconsistencies between the phrasing in
linux-user/bsd-user/system help messages)
* make linux-user and bsd-user defer to qemu-log for the default
logging destination rather than overriding it themselves
* ensure that all logfile closing is done via qemu_log_close()
and that that function doesn't close stderr
as well as the obvious change to the behaviour of do_qemu_set_log()
when no logfile name has been specified.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1361901160-28729-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
We even had the encoding of smull already handy...
Cc: Andrzej Zaborowski <balrogg@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
We're going to have use for this shortly in implementing other helpers.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Commit 0b0d3320db (TCG: Final globals
clean-up) moved code_gen_prologue but forgot to update ppc code.
This broke the build on 32-bit ppc. ppc64 is unaffected.
Cc: Evgeny Voevodin <evgenyvoevodin@gmail.com>
Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Rename the public-facing function cpu_set_log to qemu_set_log. This
requires us to rename the internal-only qemu_set_log() to
do_qemu_set_log().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
It's worth to clean-up translation blocks variables and move them
into one context as was suggested by Swirl.
Also if we use this context directly inside tcg_ctx, then it
speeds up code generation a bit.
Signed-off-by: Evgeny Voevodin <evgenyvoevodin@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Silence a (legitimate) complaint about missing parentheses:
tcg/arm/tcg-target.c: In function ‘tcg_out_qemu_ld’:
tcg/arm/tcg-target.c:1148:5: error: suggest parentheses around
comparison in operand of ‘&’ [-Werror=parentheses]
tcg/arm/tcg-target.c: In function ‘tcg_out_qemu_st’:
tcg/arm/tcg-target.c:1357:5: error: suggest parentheses around
comparison in operand of ‘&’ [-Werror=parentheses]
which meant that we would mistakenly always assert if running
a QEMU built with debug enabled on ARM.
Signed-off-by: Peter Maydell <peter.maydelL@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
This adds two optimizations using the non-zero bit mask. In some cases
involving shifts or ANDs the value can become zero, and can thus be
optimized to a move of zero. Second, useless zero-extension or an
AND with constant can be detected that would only zero bits that are
already zero.
The main advantage of this optimization is that it turns zero-extensions
into moves, thus enabling much better copy propagation (around 1% code
reduction). Here is for example a "test $0xff0000,%ecx + je" before
optimization:
mov_i64 tmp0,rcx
movi_i64 tmp1,$0xff0000
discard cc_src
and_i64 cc_dst,tmp0,tmp1
movi_i32 cc_op,$0x1c
ext32u_i64 tmp0,cc_dst
movi_i64 tmp12,$0x0
brcond_i64 tmp0,tmp12,eq,$0x0
and after (without patch on the left, with on the right):
movi_i64 tmp1,$0xff0000 movi_i64 tmp1,$0xff0000
discard cc_src discard cc_src
and_i64 cc_dst,rcx,tmp1 and_i64 cc_dst,rcx,tmp1
movi_i32 cc_op,$0x1c movi_i32 cc_op,$0x1c
ext32u_i64 tmp0,cc_dst
movi_i64 tmp12,$0x0 movi_i64 tmp12,$0x0
brcond_i64 tmp0,tmp12,eq,$0x0 brcond_i64 cc_dst,tmp12,eq,$0x0
Other similar cases: "test %eax, %eax + jne" where eax is already 32-bit
(after optimization, without patch on the left, with on the right):
discard cc_src discard cc_src
mov_i64 cc_dst,rax mov_i64 cc_dst,rax
movi_i32 cc_op,$0x1c movi_i32 cc_op,$0x1c
ext32u_i64 tmp0,cc_dst
movi_i64 tmp12,$0x0 movi_i64 tmp12,$0x0
brcond_i64 tmp0,tmp12,ne,$0x0 brcond_i64 rax,tmp12,ne,$0x0
"test $0x1, %dl + je":
movi_i64 tmp1,$0x1 movi_i64 tmp1,$0x1
discard cc_src discard cc_src
and_i64 cc_dst,rdx,tmp1 and_i64 cc_dst,rdx,tmp1
movi_i32 cc_op,$0x1a movi_i32 cc_op,$0x1a
ext8u_i64 tmp0,cc_dst
movi_i64 tmp12,$0x0 movi_i64 tmp12,$0x0
brcond_i64 tmp0,tmp12,eq,$0x0 brcond_i64 cc_dst,tmp12,eq,$0x0
In some cases TCG even outsmarts GCC. :) Here the input code has
"and $0x2,%eax + movslq %eax,%rbx + test %rbx, %rbx" and the optimizer,
thanks to copy propagation, does the following:
movi_i64 tmp12,$0x2 movi_i64 tmp12,$0x2
and_i64 rax,rax,tmp12 and_i64 rax,rax,tmp12
mov_i64 cc_dst,rax mov_i64 cc_dst,rax
ext32s_i64 tmp0,rax -> nop
mov_i64 rbx,tmp0 -> mov_i64 rbx,cc_dst
and_i64 cc_dst,rbx,rbx -> nop
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Add a "mask" field to the tcg_temp_info struct. A bit that is zero
in "mask" will always be zero in the corresponding temporary.
Zero bits in the mask can be produced from moves of immediates,
zero-extensions, ANDs with constants, shifts; they can then be
be propagated by logical operations, shifts, sign-extensions,
negations, deposit operations, and conditional moves. Other
operations will just reset the mask to all-ones, i.e. unknown.
[rth: s/target_ulong/tcg_target_ulong/]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>