xemu-project/xemu - xemu - Gitea: Git with a cup of tea

mirror of https://github.com/xemu-project/xemu.git synced 2024-12-12 22:16:19 +00:00

Author	SHA1	Message	Date
Richard Henderson	edee2579ae	tcg: Fix jit debug for x32 Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-09-02 09:08:30 -07:00
Richard Henderson	d3452f1f40	tcg: Use appropriate types in tcg_reg_alloc_call Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-09-02 09:08:30 -07:00
Richard Henderson	a05b5b9be0	tcg: Change tcg_out_ld/st offset to intptr_t Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-09-02 09:08:30 -07:00
Richard Henderson	8cfd04959a	tcg: Change tcg_gen_exit_tb argument to uintptr_t And update all users. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-09-02 09:08:30 -07:00
Richard Henderson	48bc6bab47	tcg: Use uintptr_t in TCGHelperInfo Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-09-02 09:08:29 -07:00
Richard Henderson	2ba7fae29e	tcg: Change relocation offsets to intptr_t Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-09-02 09:08:29 -07:00
Richard Henderson	2f2f244d02	tcg: Change memory offsets to intptr_t Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-09-02 09:08:29 -07:00
Richard Henderson	e2c6d1b42d	tcg: Change frame pointer offsets to intptr_t Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-09-02 09:08:29 -07:00
Richard Henderson	8b73d49f53	tcg: Define TCG_ptr properly Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-09-02 09:08:29 -07:00
Richard Henderson	d289837eef	tcg: Define TCG_TYPE_PTR properly Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-09-02 09:08:29 -07:00
Richard Henderson	78cd7b835e	tcg: Allow TCG_TARGET_REG_BITS to be specified independantly There are several hosts for which it would be useful to use the available 64-bit registers in a 32-bit pointer environment. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-09-02 09:08:29 -07:00
Richard Henderson	04d5a1da70	tcg: Change tcg_qemu_tb_exec return to uintptr_t Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-09-02 09:08:29 -07:00
Richard Henderson	b93949ef6a	tcg: Change flush_icache_range arguments to uintptr_t Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-09-02 09:08:29 -07:00
Richard Henderson	01547f7f92	tcg: Constant fold div, rem Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-09-02 09:08:29 -07:00
Richard Henderson	32f5717f07	tcg-ppc64: Implement muluh, mulsh Using these instead of mulu2 and muls2 lets us avoid having to argument overlap analysis in the backend. Normal register allocation will DTRT. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-09-02 09:08:29 -07:00
Richard Henderson	3c9a8f1756	tcg-mips: Implement mulsh, muluh With the optimization in tcg_liveness_analysis, we can avoid the MFLO when it is unused. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-09-02 09:08:29 -07:00
Richard Henderson	03271524b6	tcg: Add muluh and mulsh opcodes Use them in places where mulu2 and muls2 are used. Optimize mulx2 with dead low part to mulxh. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-09-02 09:08:29 -07:00
Stefan Weil	a32b12741b	tci: Remove function tcg_out64 (fix broken build) Commit `ac26eb69a3` added tcg_out64 to tcg/tcg.c. tcg/tci/tcg-target.c already had a nearly identical implementation which is now removed to fix a compiler error. Signed-off-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>	2013-09-01 19:36:16 +04:00
Richard Henderson	401c227b0a	tcg-i386: Use new return-argument ld/st helpers Discontinue the jump-around-jump-to-jump scheme, trading it for a single immediate move instruction. The two extra jumps always consume 7 bytes, whereas the immediate move is either 5 or 7 bytes depending on where the code_gen_buffer gets located. Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-08-26 13:31:54 -07:00
Richard Henderson	c6f29ff096	tcg-i386: Tidy qemu_ld/st slow path Use existing stack space for arguments; don't push/pop. Use less ifdefs and more C ifs. Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-08-26 13:31:53 -07:00
Richard Henderson	8023ccda07	tcg-i386: Try pc-relative lea for constant formation Use a 7 byte lea before the ultimate 10 byte movq. Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-08-26 13:31:53 -07:00
Richard Henderson	ac26eb69a3	tcg-i386: Add and use tcg_out64 No point in splitting the write into 32-bit pieces. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-08-26 13:31:53 -07:00
Richard Henderson	2bb8656dad	tcg: Tidy generated code for tcg_outN Aliasing was forcing s->code_ptr to be re-read after the store. Keep the pointer in a local variable to help the compiler. Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-08-26 13:31:53 -07:00
James Hogan	85711e6baf	tcg/mips: fix invalid op definition errors tcg/mips/tcg-target.h defines various operations conditionally depending upon the isa revision, however these operations are included in mips_op_defs[] unconditionally resulting in the following runtime errors if CONFIG_DEBUG_TCG is defined: Invalid op definition for movcond_i32 Invalid op definition for rotl_i32 Invalid op definition for rotr_i32 Invalid op definition for deposit_i32 Invalid op definition for bswap16_i32 Invalid op definition for bswap32_i32 tcg/tcg.c:1196: tcg fatal error Fix with ifdefs like the i386 backend does for movcond_i32. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2013-08-08 23:06:02 +02:00
Stefan Weil	5fe0d351b3	tci: Fix broken build (compiler warning caused by redefined macro BIT) The definition of macro BIT in tci/tcg-target.c now conflicts with the definition of the same macro in includes qemu/bitops.h. This conflict was triggered by a recent change in the include chain of tcg.c (probably commit `949fc82314`). Signed-off-by: Stefan Weil <sw@weilnetz.de> Message-id: 1375216883-23969-1-git-send-email-sw@weilnetz.de Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2013-07-30 18:48:21 -05:00
Richard Henderson	f290e4988d	Merge git://github.com/hw-claudio/qemu-aarch64-queue into tcg-next	2013-07-15 13:21:10 -07:00
Jani Kokkonen	c6d8ed24b4	tcg/aarch64: Implement tlb lookup fast path Supports CONFIG_QEMU_LDST_OPTIMIZATION Signed-off-by: Jani Kokkonen <jani.kokkonen@huawei.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>	2013-07-15 13:13:46 +02:00
Richard Henderson	0caa91fe1f	tcg-arm: Implement tcg_register_jit Allows unwinding past the code_gen_buffer. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-07-09 07:15:25 -07:00
Richard Henderson	b5cc476da7	tcg-i386: Use QEMU_BUILD_BUG_ON instead of assert for frame size We can check the condition at compile time, rather than run time. Reviewed-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-07-09 07:15:25 -07:00
Richard Henderson	497a22eb87	tcg: Move the CIE and FDE header definitions to common code These will necessarily be the same layout for all hosts. This limits the amount of boilerplate required to implement jit debug for a host. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-07-09 07:15:24 -07:00
Richard Henderson	45aba097d2	tcg: Fix high_pc fields in .debug_info I don't think the debugger actually looks at this for anything, using the correct .debug_frame contents, but might as well get it all correct. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-07-09 07:15:24 -07:00
Richard Henderson	1e709f3833	tcg-arm: Use AT_PLATFORM to detect the host ISA With this we can generate armv7 insns even when the OS compiles for a lower common denominator. The macros are arranged so that when we do compile for a given ISA, all of the runtime checks for that ISA are optimized away. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-07-09 07:15:13 -07:00
Richard Henderson	cb91021a47	tcg-arm: Simplify logic in detecting the ARM ISA in use GCC 4.8 defines a handy __ARM_ARCH symbol that we can use, which will make us nicely forward compatible with ARMv8 AArch32. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-07-09 07:15:02 -07:00
Richard Henderson	fb82273851	tcg-arm: Rename use_armv5_instructions to use_armvt5_instructions As it really controls the availability of a thumb interworking instruction on armv5t. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-07-09 07:14:51 -07:00
Richard Henderson	72e1ccfc0c	tcg-arm: Make use of conditional availability of opcodes for divide We can now detect and use divide instructions at runtime, rather than having to restrict their availability to compile-time. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-07-09 07:14:35 -07:00
Richard Henderson	c1a61f6c85	tcg: Simplify logic using TCG_OPF_NOT_PRESENT Expand the definition of "not present" to include "should not be present". This means we can simplify the logic surrounding the generic tcg opcodes for which the host backend ought not be providing definitions. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-07-09 07:14:35 -07:00
Richard Henderson	4ef76952bd	tcg: Allow non-constant control macros This allows TCG_TARGET_HAS_* to be a variable rather than a constant, which allows easier support for differing ISA levels for the host. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-07-09 07:14:35 -07:00
Richard Henderson	5b9f72ab59	tcg-ppc64: Don't implement rem Reviewed-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-07-09 07:14:34 -07:00
Richard Henderson	865a4671f9	tcg-ppc: Don't implement rem Reviewed-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-07-09 07:14:34 -07:00
Richard Henderson	5e1108b370	tcg-arm: Don't implement rem Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-07-09 07:14:34 -07:00
Richard Henderson	ca675f46e6	tcg: Split rem requirement from div requirement There are several hosts with only a "div" insn. Remainder is computed manually from the quotient and inputs. We can do this generically. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-07-09 07:14:09 -07:00
Claudio Fontana	b1f6dc0d2a	tcg/aarch64: implement ldst 12bit scaled uimm offset implement the 12bit scaled unsigned immediate offset variant of LDR/STR. This improves code size by avoiding the movi + ldst_r for naturally aligned offsets in range. Signed-off-by: Claudio Fontana <claudio.fontana@huawei.com> Reviewed-by: Richard Henderson <rth@twiddle.net>	2013-07-03 14:43:11 +02:00
Anton Blanchard	d1bdd3af49	tcg-ppc64: rotr_i32 rotates wrong amount rotr_i32 calculates the amount to left shift and puts it into a temporary, but then doesn't use it when doing the shift. Cc: qemu-stable@nongnu.org Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-06-17 10:42:16 -07:00
Anton Blanchard	8424735710	tcg-ppc64: Fix add2_i64 add2_i64 was adding the lower double word to the upper double word of each input. Fix this so we add the lower double words, then the upper double words with carry propagation. Cc: qemu-stable@nongnu.org Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-06-17 10:42:16 -07:00
Anton Blanchard	82e0f9170a	tcg-ppc64: bswap64 rotates output 32 bits If our input and output is in the same register, bswap64 tries to undo a rotate of the input. This just ends up rotating the output. Cc: qemu-stable@nongnu.org Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-06-17 10:42:16 -07:00
Anton Blanchard	8a94cfb05e	tcg-ppc64: Fix RLDCL opcode The rldcl instruction doesn't have an sh field, so the minor opcode is shifted 1 bit. We were using the XO30 macro which shifted the minor opcode 2 bits. Remove XO30 and add MD30 and MDS30 macros which match the Power ISA categories. Cc: qemu-stable@nongnu.org Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-06-17 10:41:52 -07:00
Anthony Liguori	86a6a07745	Merge remote-tracking branch 'pmaydell/tcg-aarch64.next' into staging # By Claudio Fontana (9) and others # Via Peter Maydell * pmaydell/tcg-aarch64.next: MAINTAINERS: add tcg/aarch64 maintainer configure: permit compilation on arm aarch64 tcg/aarch64: implement user mode qemu ld/st user-exec.c: aarch64 initial implementation of cpu_signal_handler tcg/aarch64: implement sign/zero extend operations tcg/aarch64: implement byte swap operations tcg/aarch64: implement AND/TEST immediate pattern tcg/aarch64: improve arith shifted regs operations tcg/aarch64: implement new TCG target for aarch64 include/elf.h: add aarch64 ELF machine and relocs configure: Drop CONFIG_ATFILE test linux-user: Drop direct use of openat etc syscalls linux-user: Allow getdents to be provided by getdents64 Message-id: 1371052645-9006-1-git-send-email-peter.maydell@linaro.org Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2013-06-14 07:51:05 -05:00
Jani Kokkonen	6a91c7c978	tcg/aarch64: implement user mode qemu ld/st also put aarch64 in the list of archs that do not need an ldscript. Signed-off-by: Jani Kokkoken <jani.kokkonen@huawei.com> Signed-off-by: Claudio Fontana <claudio.fontana@huawei.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 51AF40EE.1000104@huawei.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2013-06-12 16:20:23 +01:00
Claudio Fontana	31f1275b90	tcg/aarch64: implement sign/zero extend operations implement the optional sign/zero extend operations with the dedicated aarch64 instructions. Signed-off-by: Claudio Fontana <claudio.fontana@huawei.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-id: 51AC9A58.40502@huawei.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2013-06-12 16:20:23 +01:00
Claudio Fontana	9c4a059df3	tcg/aarch64: implement byte swap operations implement the optional byte swap operations with the dedicated aarch64 instructions. Signed-off-by: Claudio Fontana <claudio.fontana@huawei.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-id: 51AC9A33.9050003@huawei.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2013-06-12 16:20:23 +01:00
Claudio Fontana	7deea126b2	tcg/aarch64: implement AND/TEST immediate pattern add functions to AND/TEST registers with immediate patterns. Signed-off-by: Claudio Fontana <claudio.fontana@huawei.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-id: 51AC9A0C.3090303@huawei.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2013-06-12 16:20:22 +01:00
Claudio Fontana	36fac14a64	tcg/aarch64: improve arith shifted regs operations for arith operations, add SUBS, ANDS, ADDS and add a shift parameter so that all arith instructions can make use of shifted registers. Signed-off-by: Claudio Fontana <claudio.fontana@huawei.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-id: 51AC998B.7070506@huawei.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2013-06-12 16:20:22 +01:00
Claudio Fontana	4a136e0a6b	tcg/aarch64: implement new TCG target for aarch64 add preliminary support for TCG target aarch64. Signed-off-by: Claudio Fontana <claudio.fontana@huawei.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 51A5C596.3090108@huawei.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2013-06-12 16:20:22 +01:00
Richard Henderson	56bbc2f967	tcg: Remove redundant tcg_target_init checks We've got a compile-time check for the condition in exec/cpu-defs.h. Reviewed-by: Andreas Färber <afaerber@suse.de> Reviewed-by: liguang <lig.fnst@cn.fujitsu.com> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-06-05 05:54:40 -07:00
Aurelien Jarno	66e61b55f1	tcg/optimize: fix setcond2 optimization When setcond2 is rewritten into setcond, the state of the destination temp should be reset, so that a copy of the previous value is not used instead of the result. Reported-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2013-05-09 16:14:58 +02:00
Richard Henderson	c9e53a4cf1	tcg-arm: Use movi32 in exit_tb Avoid the mini constant pool for armv7, and avoid replicating the test for pre-v7. Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2013-05-03 11:53:30 +02:00
Richard Henderson	8ddaeb1be6	tcg-arm: Fix 64-bit tlb load for pre-v6 Found by inspection, since the effect of the bug was simply to send all memory ops through the slow path. Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2013-05-03 11:53:29 +02:00
Richard Henderson	96fbd7de36	tcg-arm: Remove long jump from tcg_out_goto_label Branches within a TB will always be within 16MB. Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-27 02:16:45 +02:00
Richard Henderson	df5e0ef711	tcg-arm: Convert to CONFIG_QEMU_LDST_OPTIMIZATION Move the slow path out of line, as the TODO's mention. This allows the fast path to be unconditional, which can speed up the fast path as well, depending on the core. Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-27 02:16:45 +02:00
Richard Henderson	302fdde73f	tcg-arm: Use movi32 + blx for calls on v7 Work better with branch predition when we have movw+movt, as the size of the code is the same. Perhaps re-evaluate when we have a proper constant pool. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-27 02:16:45 +02:00
Richard Henderson	595b5397cc	tcg-arm: Delete the 'S' constraint After the previous patch, 's' and 'S' are the same. Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-27 02:16:45 +02:00
Richard Henderson	702b33b1d5	tcg-arm: Improve scheduling of tcg_out_tlb_read The schedule was fully serial, with no possibility for dual issue. The old schedule had a minimal issue of 7 cycles; the new schedule has a minimal issue of 5 cycles. Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-27 02:16:45 +02:00
Richard Henderson	cee87be80a	tcg-arm: Split out tcg_out_tlb_read Share code between qemu_ld and qemu_st to process the tlb. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-27 02:16:45 +02:00
Richard Henderson	9feac1d770	tcg-arm: Cleanup most primitive load store subroutines Use even more primitive helper functions to avoid lots of duplicated code. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-27 02:16:44 +02:00
Richard Henderson	34358a12c8	tcg-arm: Cleanup multiply subroutines Make the code more readable by only having one copy of the magic numbers, swapping registers as needed prior to that. Speed the compiler by not applying the rd == rn avoidance for v6 or later. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-27 02:16:44 +02:00
Richard Henderson	13dd6fb962	tcg-arm: Use R12 for the tcg temporary R12 is call clobbered, while R8 is call saved. This change gives tcg one more call saved register for real data. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-27 02:16:44 +02:00
Richard Henderson	4346457a47	tcg-arm: Use TCG_REG_TMP name for the tcg temporary Don't hard-code R8. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-27 02:16:44 +02:00
Richard Henderson	0637c56c99	tcg-arm: Implement division instructions An armv7 extension implements division, present on Cortex A15. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-27 02:16:44 +02:00
Richard Henderson	b6b24cb031	tcg-arm: Implement deposit for armv7 We have BFI and BFC available for implementing it. Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-27 02:16:44 +02:00
Richard Henderson	e86e0f2807	tcg-arm: Improve constant generation Try fully rotated arguments to mov and mvn before trying movt or full decomposition. Begin decomposition with mvn when it looks like it'll help. Examples include -: mov r9, #0x00000fa0 -: orr r9, r9, #0x000ee000 -: orr r9, r9, #0x0ff00000 -: orr r9, r9, #0xf0000000 +: mvn r9, #0x0000005f +: eor r9, r9, #0x00011000 Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-27 02:16:43 +02:00
Richard Henderson	2df3f1ee68	tcg-arm: Handle constant arguments to add2/sub2 We get to re-use the _rIN and _rIK subroutines to handle the various combinations of add vs sub. Fold the << 21 into the opcode enum values so that we can explicitly add TO_CPSR as desired. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-27 02:16:43 +02:00
Richard Henderson	5d53b4c93c	tcg-arm: Use tcg_out_dat_rIN for compares This allows us to emit CMN instructions. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-27 02:16:43 +02:00
Richard Henderson	d9fda57549	tcg-arm: Allow constant first argument to sub This allows the generation of RSB instructions. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-27 02:16:43 +02:00
Richard Henderson	a9a86ae95d	tcg-arm: Handle negated constant arguments to and/sub This greatly improves code generation for addition of small negative constants. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-27 02:16:43 +02:00
Richard Henderson	19b62bf414	tcg-arm: Use bic to implement and with constant This greatly improves the code we can produce for deposit without armv7 support. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-27 02:16:42 +02:00
Richard Henderson	d6b64b2b60	tcg: Log the contents of the prologue with -d out_asm This makes it easier to verify changes to the code generating the prologue. [Aurelien: change the format from %i to %zu] Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-27 02:15:55 +02:00
Richard Henderson	fc4d60ee16	tcg-arm: Fix local stack frame We were not allocating TCG_STATIC_CALL_ARGS_SIZE, so this meant that any helper with more than 4 arguments would clobber the saved regs. Realizing that we're supposed to have this memory pre-allocated means we can clean up the tcg_out_arg functions, which were trying to do more stack allocation. Allocate stack memory for the TCG temporaries while we're at it. Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-27 01:19:20 +02:00
Aurelien Jarno	ed605126a8	tcg: fix deposit_i64 op on 32-bit targets On 32-bit TCG targets, when emulating deposit_i64 with a mov_i32 + deposit_i32, care should be taken to not overwrite the low part of the second argument before the deposit when it is the same the destination. This fixes the shld instruction in qemu-system-x86_64, which in turns fixes booting "system rescue CD version 2.8.0" on this target. Reported-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2013-04-27 01:10:18 +02:00
Richard Henderson	39dc85b985	tcg-ppc64: Handle deposit of zero The TCG optimizer does great work when inserting constants, being able to fold the open-coded deposit expansion to just an AND or an OR. Avoid a bit the regression caused by having the deposit opcode by expanding deposit of zero as an AND. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 20:09:55 +02:00
Richard Henderson	6645c147db	tcg-ppc64: Implement mulu2/muls2_i64 Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 20:09:54 +02:00
Richard Henderson	6c858762de	tcg-ppc64: Implement add2/sub2_i64 Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 20:09:54 +02:00
Richard Henderson	1e6e9aca15	tcg-ppc64: Use getauxval for ISA detection Glibc 2.16 includes an easy way to get feature bits previously buried in /proc or the program startup auxiliary vector. Use it. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 20:09:54 +02:00
Richard Henderson	027ffea972	tcg-ppc64: Implement movcond Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 20:09:54 +02:00
Richard Henderson	70fac59a2a	tcg-ppc64: Use ISEL for setcond There are a few simple special cases that should be handled first. Break these out to subroutines to avoid code duplication. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 20:09:53 +02:00
Richard Henderson	6995a4a063	tcg-ppc64: Use MFOCRF instead of MFCR It takes half the cycles to read one CR register instead of all 8. This is a backward compatible addition to the ISA, so chips prior to Power 2.00 spec will simply continue to read the entire CR register. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 20:09:53 +02:00
Richard Henderson	991041a4eb	tcg-ppc64: Cleanup i32 constants to tcg_out_cmp Nothing else in the call chain ensures that these constants don't have garbage in the high bits. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 20:09:53 +02:00
Richard Henderson	4c314da6d1	tcg-ppc64: Use TCGType throughout compares The optimization/bug being fixed is that tcg_out_cmp was not applying the right type to loading a constant, in the case it can't be implemented directly. Rather than recomputing the TCGType enum from the arch64 bool, pass around the original TCGType throughout. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 20:09:52 +02:00
Richard Henderson	ef809300fc	tcg-ppc64: Use I constraint for mul The mul_i32 pattern was loading non-16-bit constants into a register, when we can get the middle-end to do that for us. The mul_i64 pattern was not considering that MULLI takes 64-bit inputs. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 20:09:52 +02:00
Richard Henderson	33de9ed223	tcg-ppc64: Implement deposit Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 20:09:52 +02:00
Richard Henderson	37251b98db	tcg-ppc64: Handle constant inputs for some compound logicals Since we have special code to handle and/or/xor with a constant, apply the same to andc/orc/eqv with a constant. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 20:09:51 +02:00
Richard Henderson	ce1010d6e3	tcg-ppc64: Implement compound logicals Mostly copied from the ppc32 port. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 20:09:51 +02:00
Richard Henderson	68aebd45b1	tcg-ppc64: Implement bswap64 Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 20:09:51 +02:00
Richard Henderson	5d22158200	tcg-ppc64: Implement bswap16 and bswap32 Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 20:09:44 +02:00
Richard Henderson	313d91c778	tcg-ppc64: Implement rotates Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 19:55:38 +02:00
Richard Henderson	49d9870a54	tcg-ppc64: Streamline qemu_ld/st insn selection Using a table to look up insns of the right width and sign. Include support for the Power 2.06 LDBRX and STDBRX insns. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 19:55:35 +02:00
Richard Henderson	28f2dba6dc	tcg-ppc64: Use automatic implementation of ext32u_i64 The enhancements to and immediate obviate this. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 19:55:31 +02:00
Richard Henderson	637af30c76	tcg-ppc64: Improve and_i64 with constant Use RLDICL and RLDICR. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 19:55:27 +02:00
Richard Henderson	a9249dff4d	tcg-ppc64: Improve and_i32 with constant Use RLWINM Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 19:55:27 +02:00
Richard Henderson	dce74c57bb	tcg-ppc64: Tidy or and xor patterns. Handle constants in common code; we'll want to reuse that later. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 19:55:26 +02:00
Richard Henderson	148bdd2373	tcg-ppc64: Allow constant first argument to sub Using SUBFIC for 16-bit signed constants. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 19:55:22 +02:00
Richard Henderson	ee924fa6b3	tcg-ppc64: Improve constant add and sub ops. Improve constant addition -- previously we'd emit useless addi with 0. Use new constraints to force the driver to pull full 64-bit constants into a register. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 19:55:15 +02:00
Richard Henderson	3d582c6179	tcg-ppc64: Rearrange integer constant constraints We'll need a zero, and Z makes more sense for that. Make sure we have a full compliment of signed and unsigned 16 and 32-bit tests. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 19:52:05 +02:00
Richard Henderson	421233a146	tcg-ppc64: Cleanup tcg_out_movi The test for using movi32 was sub-optimal for TCG_TYPE_I32, comparing a signed 32-bit quantity against an unsigned 32-bit quantity. When possible, use addi+oris for 32-bit unsigned constants. Otherwise, standardize on addi+oris+ori instead of addis+ori+rldicl. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 19:52:04 +02:00
Richard Henderson	752c1fdb6d	tcg-ppc64: Fix setcond_i32 We weren't ignoring the high 32 bits during a NE comparison. Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 19:51:50 +02:00
Richard Henderson	2fd8eddcab	tcg-ppc64: Introduce and use TAI and SAI Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 19:44:48 +02:00
Richard Henderson	5e916c287e	tcg-ppc64: Introduce and use tcg_out_shri64 Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 19:44:46 +02:00
Richard Henderson	0a9564b964	tcg-ppc64: Introduce and use tcg_out_shli64 Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 19:44:44 +02:00
Richard Henderson	6e5e06024f	tcg-ppc64: Introduce and use tcg_out_ext32u Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 19:44:41 +02:00
Richard Henderson	9e555b735c	tcg-ppc64: Introduce and use tcg_out_rlw Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 19:44:39 +02:00
Richard Henderson	aceac8d685	tcg-ppc64: Use TCGReg everywhere Reviewed-by: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-15 19:44:37 +02:00
Aurelien Jarno	0a9c2341de	Merge branch 'tci' of git://qemu.weilnetz.de/qemu * 'tci' of git://qemu.weilnetz.de/qemu: tci: Make tcg temporaries local to tcg_qemu_tb_exec tci: Delete unused tb_ret_addr tci: Avoid code before declarations tci: Use a local variable for env tci: Use 32-bit signed offsets to loads/stores	2013-04-13 13:50:06 +02:00
Richard Henderson	ee79c356ff	tci: Make tcg temporaries local to tcg_qemu_tb_exec We're moving away from the temporaries stored in env. Make sure we can differentiate between temp stores and possibly bogus stores for extra call arguments. Move TCG_AREG0 and TCG_REG_CALL_STACK out of the way of the parameter passing registers. Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off by: Stefan Weil <sw@weilnetz.de>	2013-04-11 19:58:21 +02:00
Richard Henderson	4699ca6dbf	tci: Delete unused tb_ret_addr Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off by: Stefan Weil <sw@weilnetz.de>	2013-04-11 19:58:21 +02:00
Richard Henderson	03fc0548b7	tci: Use 32-bit signed offsets to loads/stores Since the change to tcg_exit_req, the first insn of every TB is a load with a negative offset from env. Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off by: Stefan Weil <sw@weilnetz.de>	2013-04-11 19:58:21 +02:00
Richard Henderson	b879f30846	tcg-s390: Fix merge error in tgen_brcond When the TCG condition codes were re-organized last year, we failed to update all of the "old-style" tests for unsigned. Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-05 13:35:42 -05:00
Richard Henderson	78c9f7c5b0	tcg-s390: Use all 20 bits of the offset in tcg_out_mem This can save one insn, if the constant has any bits in 32-63 set, but no bits in 21-31 set. It never results in more insns. Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-05 13:35:41 -05:00
Richard Henderson	0db921e6d8	tcg-s390: Use load-address for addition Since we're always in 64-bit mode, load address performs a full 64-bit add. Use that for 3-address addition, as well as for larger constant addends when we lack extended-immediates facility. Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-05 13:35:41 -05:00
Richard Henderson	65a62a753c	tcg-s390: Cleanup argument shuffling fixme in softmmu code Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-05 13:35:41 -05:00
Richard Henderson	f0bffc2730	tcg-s390: Use risbgz for andi This is immediately usable by the tlb lookup code. Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-05 13:35:41 -05:00
Richard Henderson	07ff798313	tcg-s390: Remove constraint letters for and Since we have a free temporary and can always just load the constant, we ought to do so, rather than spending the same effort constraining the const. Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-05 13:35:41 -05:00
Richard Henderson	d5690ea433	tcg-s390: Implement deposit opcodes Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-05 13:35:40 -05:00
Richard Henderson	96a9f093f8	tcg-s390: Implement movcond opcodes Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-05 13:35:40 -05:00
Richard Henderson	36017dc68a	tcg-s390: Implement mulu2_i64 opcode Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-05 13:35:40 -05:00
Richard Henderson	3790b9180a	tcg-s390: Implement add2/sub2 opcodes Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-05 13:35:40 -05:00
Richard Henderson	a01fc30da4	tcg-s390: Remove useless preprocessor conditions We only support 64-bit code generation for s390x. Don't clutter the code with ifdefs that suggest otherwise. Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-05 13:35:40 -05:00
Richard Henderson	a4924e8bb5	tcg-s390: Properly allocate a stack frame. Set TCG_TARGET_CALL_STACK_OFFSET properly for the abi. Allocate the standard TCG_STATIC_CALL_ARGS_SIZE. And while we're at it, allocate space for CPU_TEMP_BUF_NLONGS. Signed-off-by: Richard Henderson <rth@twiddle.net>	2013-04-05 13:35:40 -05:00
Richard Henderson	a22971f99f	tcg-s390: Fix movi The code to load the high 64 bits assumed that the insn used to load the low 64 bits zero-extended. Enforce that.	2013-04-05 13:35:39 -05:00
Aurelien Jarno	174d4d215f	tcg/mips: Implement muls2_i32 Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2013-04-01 18:49:17 +02:00
Richard Henderson	2d497542e1	tcg-optimize: Fold sub r,0,x to neg r,x Cc: Blue Swirl <blauwirbel@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-03-23 14:31:03 +00:00
陳韋任 (Wei-Ren Chen)	294e4669a5	Use proper term in TCG README In TCG, "target" means the host architecture for which TCG generates the code. Using "guest" rather than "target" to make the document more consistent. Signed-off-by: Chen Wei-Ren <chenwj@iis.sinica.edu.tw> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-22 15:55:03 +01:00
Peter Maydell	378df4b237	Handle CPU interrupts by inline checking of a flag Fix some of the nasty TCG race conditions and crashes by implementing cpu_exit() as setting a flag which is checked at the start of each TB. This avoids crashes if a thread or signal handler calls cpu_exit() while the execution thread is itself modifying the TB graph (which may happen in system emulation mode as well as in linux-user mode with a multithreaded guest binary). This fixes the crashes seen in LP:668799; however there are another class of crashes described in LP:1098729 which stem from the fact that in linux-user with a multithreaded guest all threads will use and modify the same global TCG date structures (including the generated code buffer) without any kind of locking. This means that multithreaded guest binaries are still in the "unsupported" category. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-03-03 14:28:47 +00:00
Peter Maydell	0980011b4f	tcg: Document tcg_qemu_tb_exec() and provide constants for low bit uses Document tcg_qemu_tb_exec(). In particular, its return value is a combination of a pointer to the next translation block and some extra information in the low two bits. Provide some #defines for the values passed in these bits to improve code clarity. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-03-03 14:28:19 +00:00
Blue Swirl	07ca08bac8	tcg-sparc: fix build Fix build breakage by `803d805bce`: make tcg_out_addsub2() always available. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-03-03 08:25:50 +00:00
Peter Maydell	989b697ddd	qemu-log: default to stderr for logging output Switch the default for qemu_log logging output from "/tmp/qemu.log" to stderr. This is an incompatible change in some sense, but logging is mostly used for debugging purposes so it shouldn't affect production use. The previous behaviour can be obtained by adding "-D /tmp/qemu.log" to the command line. This change requires us to: * update all the documentation/help text (we take the opportunity to smooth out minor inconsistencies between the phrasing in linux-user/bsd-user/system help messages) * make linux-user and bsd-user defer to qemu-log for the default logging destination rather than overriding it themselves * ensure that all logfile closing is done via qemu_log_close() and that that function doesn't close stderr as well as the obvious change to the behaviour of do_qemu_set_log() when no logfile name has been specified. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Message-id: 1361901160-28729-1-git-send-email-peter.maydell@linaro.org Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2013-02-26 13:31:47 -06:00
Richard Henderson	f1fae40c61	tcg: Apply life analysis to 64-bit multiword arithmetic ops Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-02-23 17:25:29 +00:00
Richard Henderson	f402f38f43	tcg: Implement muls2 with mulu2 Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-02-23 17:25:29 +00:00
Richard Henderson	d693e14733	tcg-arm: Implement muls2_i32 We even had the encoding of smull already handy... Cc: Andrzej Zaborowski <balrogg@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-02-23 17:25:29 +00:00
Richard Henderson	624988a53b	tcg-i386: Implement multiword arithmetic ops Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-02-23 17:25:29 +00:00
Richard Henderson	f6953a7399	tcg: Implement multiword addition helpers Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-02-23 17:25:28 +00:00
Richard Henderson	696a8be6a0	tcg: Implement multiword multiply helpers Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-02-23 17:25:28 +00:00
Richard Henderson	3c51a98507	tcg: Implement a 64-bit to 32-bit extraction helper We're going to have use for this shortly in implementing other helpers. Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-02-23 17:25:28 +00:00
Richard Henderson	4d3203fd0b	tcg: Add signed multiword multiplication operations Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-02-23 17:25:28 +00:00
Richard Henderson	d7156f7ce4	tcg: Add 64-bit multiword arithmetic operations Matching the 32-bit multiword arithmetic that we already have. Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-02-23 17:25:28 +00:00
Richard Henderson	803d805bce	tcg-sparc: Always implement 32-bit multiword ops Cc: Blue Swirl <blauwirbel@gmail.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-02-23 17:25:28 +00:00
Richard Henderson	bbc863bfec	tcg-i386: Always implement 32-bit multiword ops Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-02-23 17:25:28 +00:00
Richard Henderson	e6a7273454	tcg: Make 32-bit multiword operations optional for 64-bit hosts Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-02-23 17:25:28 +00:00
Andreas Färber	be96bd3fbf	tcg/ppc: Fix build of tcg_qemu_tb_exec() Commit `0b0d3320db` (TCG: Final globals clean-up) moved code_gen_prologue but forgot to update ppc code. This broke the build on 32-bit ppc. ppc64 is unaffected. Cc: Evgeny Voevodin <evgenyvoevodin@gmail.com> Cc: Blue Swirl <blauwirbel@gmail.com> Signed-off-by: Andreas Färber <andreas.faerber@web.de> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-02-17 14:27:36 +00:00
Peter Maydell	24537a0191	qemu-log: Rename the public-facing cpu_set_log function to qemu_set_log Rename the public-facing function cpu_set_log to qemu_set_log. This requires us to rename the internal-only qemu_set_log() to do_qemu_set_log(). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-02-16 10:44:44 +00:00
Evgeny Voevodin	5e5f07e08f	TCG: Move translation block variables to new context inside tcg_ctx: tb_ctx It's worth to clean-up translation blocks variables and move them into one context as was suggested by Swirl. Also if we use this context directly inside tcg_ctx, then it speeds up code generation a bit. Signed-off-by: Evgeny Voevodin <evgenyvoevodin@gmail.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-02-16 10:41:16 +00:00
Evgeny Voevodin	0b0d3320db	TCG: Final globals clean-up Signed-off-by: Evgeny Voevodin <evgenyvoevodin@gmail.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-02-16 10:40:56 +00:00
Peter Maydell	5256a7208a	tcg/target-arm: Add missing parens to assertions Silence a (legitimate) complaint about missing parentheses: tcg/arm/tcg-target.c: In function ‘tcg_out_qemu_ld’: tcg/arm/tcg-target.c:1148:5: error: suggest parentheses around comparison in operand of ‘&’ [-Werror=parentheses] tcg/arm/tcg-target.c: In function ‘tcg_out_qemu_st’: tcg/arm/tcg-target.c:1357:5: error: suggest parentheses around comparison in operand of ‘&’ [-Werror=parentheses] which meant that we would mistakenly always assert if running a QEMU built with debug enabled on ARM. Signed-off-by: Peter Maydell <peter.maydelL@linaro.org> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-01-19 10:27:45 +00:00
Paolo Bonzini	633f650254	optimize: optimize using nonzero bits This adds two optimizations using the non-zero bit mask. In some cases involving shifts or ANDs the value can become zero, and can thus be optimized to a move of zero. Second, useless zero-extension or an AND with constant can be detected that would only zero bits that are already zero. The main advantage of this optimization is that it turns zero-extensions into moves, thus enabling much better copy propagation (around 1% code reduction). Here is for example a "test $0xff0000,%ecx + je" before optimization: mov_i64 tmp0,rcx movi_i64 tmp1,$0xff0000 discard cc_src and_i64 cc_dst,tmp0,tmp1 movi_i32 cc_op,$0x1c ext32u_i64 tmp0,cc_dst movi_i64 tmp12,$0x0 brcond_i64 tmp0,tmp12,eq,$0x0 and after (without patch on the left, with on the right): movi_i64 tmp1,$0xff0000 movi_i64 tmp1,$0xff0000 discard cc_src discard cc_src and_i64 cc_dst,rcx,tmp1 and_i64 cc_dst,rcx,tmp1 movi_i32 cc_op,$0x1c movi_i32 cc_op,$0x1c ext32u_i64 tmp0,cc_dst movi_i64 tmp12,$0x0 movi_i64 tmp12,$0x0 brcond_i64 tmp0,tmp12,eq,$0x0 brcond_i64 cc_dst,tmp12,eq,$0x0 Other similar cases: "test %eax, %eax + jne" where eax is already 32-bit (after optimization, without patch on the left, with on the right): discard cc_src discard cc_src mov_i64 cc_dst,rax mov_i64 cc_dst,rax movi_i32 cc_op,$0x1c movi_i32 cc_op,$0x1c ext32u_i64 tmp0,cc_dst movi_i64 tmp12,$0x0 movi_i64 tmp12,$0x0 brcond_i64 tmp0,tmp12,ne,$0x0 brcond_i64 rax,tmp12,ne,$0x0 "test $0x1, %dl + je": movi_i64 tmp1,$0x1 movi_i64 tmp1,$0x1 discard cc_src discard cc_src and_i64 cc_dst,rdx,tmp1 and_i64 cc_dst,rdx,tmp1 movi_i32 cc_op,$0x1a movi_i32 cc_op,$0x1a ext8u_i64 tmp0,cc_dst movi_i64 tmp12,$0x0 movi_i64 tmp12,$0x0 brcond_i64 tmp0,tmp12,eq,$0x0 brcond_i64 cc_dst,tmp12,eq,$0x0 In some cases TCG even outsmarts GCC. :) Here the input code has "and $0x2,%eax + movslq %eax,%rbx + test %rbx, %rbx" and the optimizer, thanks to copy propagation, does the following: movi_i64 tmp12,$0x2 movi_i64 tmp12,$0x2 and_i64 rax,rax,tmp12 and_i64 rax,rax,tmp12 mov_i64 cc_dst,rax mov_i64 cc_dst,rax ext32s_i64 tmp0,rax -> nop mov_i64 rbx,tmp0 -> mov_i64 rbx,cc_dst and_i64 cc_dst,rbx,rbx -> nop Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-01-19 10:13:16 +00:00
Paolo Bonzini	3a9d8b179b	optimize: track nonzero bits of registers Add a "mask" field to the tcg_temp_info struct. A bit that is zero in "mask" will always be zero in the corresponding temporary. Zero bits in the mask can be produced from moves of immediates, zero-extensions, ANDs with constants, shifts; they can then be be propagated by logical operations, shifts, sign-extensions, negations, deposit operations, and conditional moves. Other operations will just reset the mask to all-ones, i.e. unknown. [rth: s/target_ulong/tcg_target_ulong/] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-01-19 10:13:14 +00:00
Paolo Bonzini	d193a14a2c	optimize: only write to state when clearing optimizer data The next patch will add to the TCG optimizer a field that should be non-zero in the default case. Thus, replace the memset of the temps array with a loop. Only the state field has to be up-to-date, because others are not used except if the state is TCG_TEMP_COPY or TCG_TEMP_CONST. [rth: Extracted the loop to a function.] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-01-19 10:13:13 +00:00
Paolo Bonzini	163fa4b09d	tcg-i386: use LEA for 3-operand 64-bit addition Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-01-12 12:45:56 +00:00
Stefan Weil	9a8a5ae69d	tcg: Remove unneeded assertion Commit `7f6f0ae5b9` added two assertions. One of these assertions is not needed: The pointer ts is never NULL because it is initialized with the address of an array element. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2013-01-02 11:23:21 -06:00
Richard Henderson	753d99d38b	tcg-hppa: Fix typo in brcond2 Reported-by: Stuart Brady <sdb@zubnet.me.uk> Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2012-12-29 12:21:53 +00:00
Richard Henderson	76a347e1cd	tcg-i386: Perform cmov detection at runtime for 32-bit. Existing compile-time detection is spotty at best. Convert it all to runtime detection instead. Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2012-12-29 12:21:16 +00:00
Richard Henderson	afcb92beac	tcg: Add TCGV_IS_UNUSED_* Cc: Aurelien Jarno <aurelien@aurel32.net> Signed-off-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Andreas Färber <afaerber@suse.de> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2012-12-29 12:14:07 +00:00
Paolo Bonzini	1de7afc984	misc: move include files to include/qemu/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-12-19 08:32:39 +01:00
Paolo Bonzini	022c62cbbc	exec: move include files to include/exec/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-12-19 08:31:31 +01:00
Paolo Bonzini	cb9c377f54	janitor: add guards to headers Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-12-19 08:31:31 +01:00
Anthony Liguori	7c12fd9b29	Merge remote-tracking branch 'stefanha/trivial-patches' into staging * stefanha/trivial-patches: pc_sysfw: Plug memory leak on pc_fw_add_pflash_drv() error path qemu-options: Fix space at EOL Fix spelling in comments and documentation Clean up pci_drive_hot_add()'s use of BlockInterfaceType arm: a9mpcore: remove un-used ptimer_iomem field target-sparc: Remove t0, t1 from CPUSPARCState target-m68k: Remove t1 from CPUM68KState target-alpha: Remove t0, t1 from CPUAlphaState s390x: Spelling fixes (endianess -> endianness, occured -> occurred) Fix comments (adress -> address, layed -> laid, wierd -> weird) Fix spelling (prefered -> preferred) configure: Remove stray debug output sd: Send debug printfery to stderr not stdout Conflicts: configure Resolve spelling conflict in configure. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2012-12-10 08:34:29 -06:00
Evgeny Voevodin	c3a43607d9	tcg/tcg.h: Duplicate global TCG gen_opc_ arrays into TCGContext. Signed-off-by: Evgeny Voevodin <e.voevodin@samsung.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2012-12-08 14:24:41 +00:00
Stefan Weil	a93cf9dfba	Fix comments (adress -> address, layed -> laid, wierd -> weird) Remove also a duplicated 'the'. Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2012-12-07 12:34:11 +01:00
Aurelien Jarno	e5138db510	tcg: mark local temps as MEM in dead_temp() In dead_temp, local temps should always be marked as back to memory, even if they have not been allocated (i.e. they are discared before cross a basic block). It fixes the following assertion in target-xtensa: qemu-system-xtensa: tcg/tcg.c:1665: temp_save: Assertion `s->temps[temp].val_type == 2 \|\| s->temps[temp].fixed_reg' failed. Aborted Reported-by: Max Filippov <jcmvbkbc@gmail.com> Tested-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2012-11-24 13:24:13 +01:00
Aurelien Jarno	7aab08aa78	tcg/arm: fix cross-endian qemu_st16 The bswap16 TCG opcode assumes that the high bytes of the temp equal to 0 before calling it. The ARM backend implementation takes this assumption to slightly optimize the generated code. The same implementation is called for implementing the cross-endian qemu_st16 opcode, where this assumption is not true anymore. One way to fix that would be to zero the high bytes before calling it. Given the store instruction just ignore them, it is possible to provide a slightly more optimized version. With ARMv6+ the rev16 instruction does the work correctly. For lower ARM versions the patch provides a version which behaves correctly with non-zero high bytes, but fill them with junk. Cc: Andrzej Zaborowski <balrogg@gmail.com> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: qemu-stable@nongnu.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2012-11-24 13:19:53 +01:00
Aurelien Jarno	d17bd1d8cc	tcg/arm: fix TLB access in qemu-ld/st ops The TCG arm backend considers likely that the offset to the TLB entries does not exceed 12 bits for mem_index = 0. In practice this is not true for at least the MIPS target. The current patch fixes that by loading the bits 23-12 with a separate instruction, and using loads with address writeback, independently of the value of mem_idx. In total this allow a 24-bit offset, which is a lot more than needed. Cc: Andrzej Zaborowski <balrogg@gmail.com> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: qemu-stable@nongnu.org Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2012-11-24 13:19:53 +01:00
malc	ecf51c9abe	tcg/ppc: Fix !softmmu case Signed-off-by: malc <av1474@comtv.ru>	2012-11-21 10:56:22 +04:00
malc	ecdffbccd7	tcg/ppc: Remove unused s_bits variable Thanks to Alexander Graf for heads up. Signed-off-by: malc <av1474@comtv.ru>	2012-11-19 22:22:24 +04:00
Stefan Weil	e24dc9feb0	tci: Support deposit operations The operations for INDEX_op_deposit_i32 and INDEX_op_deposit_i64 are now supported and enabled by default. Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2012-11-18 20:40:08 +00:00
Evgeny Voevodin	83eeb39669	TCG: Remove unused global variables Signed-off-by: Evgeny Voevodin <e.voevodin@samsung.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2012-11-17 13:53:38 +00:00
Evgeny Voevodin	1ff0a2c594	TCG: Use gen_opparam_buf from context instead of global variable. Signed-off-by: Evgeny Voevodin <e.voevodin@samsung.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2012-11-17 13:53:37 +00:00
Evgeny Voevodin	92414b31e7	TCG: Use gen_opc_buf from context instead of global variable. Signed-off-by: Evgeny Voevodin <e.voevodin@samsung.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2012-11-17 13:53:36 +00:00
Evgeny Voevodin	c4afe5c4d3	TCG: Use gen_opparam_ptr from context instead of global variable. Signed-off-by: Evgeny Voevodin <e.voevodin@samsung.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2012-11-17 13:53:34 +00:00
Evgeny Voevodin	efd7f48600	TCG: Use gen_opc_ptr from context instead of global variable. Signed-off-by: Evgeny Voevodin <e.voevodin@samsung.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2012-11-17 13:53:27 +00:00
Evgeny Voevodin	8232a46a16	tcg/tcg.h: Duplicate global TCG variables in TCGContext Signed-off-by: Evgeny Voevodin <e.voevodin@samsung.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2012-11-17 13:53:26 +00:00
Kirill Batuzov	3c5645fab3	tcg: properly check that op's output needs to be synced to memory Fix typo introduced in `b3a1be87ba`. Reported-by: Ruslan Savchenko <ruslan.savchenko@gmail.com> Signed-off-by: Kirill Batuzov <batuzovk@ispras.ru> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2012-11-11 16:06:46 +01:00
malc	c878da3b27	tcg/ppc32: Use trampolines to trim the code size for mmu slow path accessors mmu access looks something like: <check tlb> if miss goto slow_path <fast path> done: ... ; end of the TB slow_path: <pre process> mr r3, r27 ; move areg0 to r3 ; (r3 holds the first argument for all the PPC32 ABIs) <call mmu_helper> b $+8 .long done <post process> b done On ppc32 <call mmu_helper> is: (SysV and Darwin) mmu_helper is most likely not within direct branching distance from the call site, necessitating a. moving 32 bit offset of mmu_helper into a GPR ; 8 bytes b. moving GPR to CTR/LR ; 4 bytes c. (finally) branching to CTR/LR ; 4 bytes r3 setting - 4 bytes call - 16 bytes dummy jump over retaddr - 4 bytes embedded retaddr - 4 bytes Total overhead - 28 bytes (PowerOpen (AIX)) a. moving 32 bit offset of mmu_helper's TOC into a GPR1 ; 8 bytes b. loading 32 bit function pointer into GPR2 ; 4 bytes c. moving GPR2 to CTR/LR ; 4 bytes d. loading 32 bit small area pointer into R2 ; 4 bytes e. (finally) branching to CTR/LR ; 4 bytes r3 setting - 4 bytes call - 24 bytes dummy jump over retaddr - 4 bytes embedded retaddr - 4 bytes Total overhead - 36 bytes Following is done to trim the code size of slow path sections: In tcg_target_qemu_prologue trampolines are emitted that look like this: trampoline: mfspr r3, LR addi r3, 4 mtspr LR, r3 ; fixup LR to point over embedded retaddr mr r3, r27 <jump mmu_helper> ; tail call of sorts And slow path becomes: slow_path: <pre process> <call trampoline> .long done <post process> b done call - 4 bytes (trampoline is within code gen buffer and most likely accessible via direct branch) embedded retaddr - 4 bytes Total overhead - 8 bytes In the end the icache pressure is decreased by 20/28 bytes at the cost of an extra jump to trampoline and adjusting LR (to skip over embedded retaddr) once inside. Signed-off-by: malc <av1474@comtv.ru>	2012-11-06 04:37:57 +04:00
malc	ed224a56b3	tcg/ppc: ld/st optimization Signed-off-by: malc <av1474@comtv.ru>	2012-11-03 20:17:54 +04:00
Yeongkyoon Lee	b76f0d8c2e	tcg: Optimize qemu_ld/st by generating slow paths at the end of a block Add optimized TCG qemu_ld/st generation which locates the code of TLB miss cases at the end of a block after generating the other IRs. Currently, this optimization supports only i386 and x86_64 hosts. Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2012-11-03 09:44:21 +00:00
Aurelien Jarno	b3a1be87ba	tcg: don't remove op if output needs to be synced to memory Commit `9c43b68de6` do not correctly check for dead outputs when they need to be synced to memory in case of half-dead operations. Fix that by applying the same pattern than for the default case. Tested-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2012-10-31 22:20:45 +01:00
Aurelien Jarno	3585317f6f	tcg/mips: use MUL instead of MULT on MIPS32 and above MIPS32 and later instruction sets have a multiplication instruction directly operating on GPRs. It only produces a 32-bit result but it is exactly what is needed by QEMU. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2012-10-30 00:34:48 +01:00
Richard Henderson	44b37ace06	tcg-i386: Use %gs prefixes for x86_64 GUEST_BASE When we allocate a reserved_va for the guest, the kernel will likely choose an address well above 4G. At which point we must use a pair of movabsq+addq to form the host address. If we have OS support, set up a segment register to point to guest_base instead. Signed-off-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2012-10-28 14:54:25 +01:00
Aurelien Jarno	b393ab4228	tcg: remove compatiblity call flags Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2012-10-28 14:54:24 +01:00
Aurelien Jarno	7850527966	tcg: rework TCG helper flags The current helper flags, TCG_CALL_CONST and TCG_CALL_PURE might be confusing and doesn't provide enough granularity for some helpers (FP helpers for example). This patch changes them into the following helpers flags: - TCG_CALL_NO_READ_GLOBALS means that the helper does not read globals, either directly or via an exception. They will not be saved to their canonical location before calling the helper. - TCG_CALL_NO_WRITE_GLOBALS means that the helper does not modify any globals. They will only be saved to their canonical locations before calling helpers, but they won't be reloaded afterwise. - TCG_CALL_NO_SIDE_EFFECTS means that the call to the function is removed if the return value is not used. It provides convenience flags, to avoid helper definitions longer than 80 characters. It also provides compatibility flags, and updates the documentation. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2012-10-28 14:54:23 +01:00
Aurelien Jarno	3d5c5f876d	tcg: synchronize globals for ops with side effects Operations with side effects (in practice qemu_ld/st ops), only need to synchronize globals to make sure the CPU state is consistent in case of exception. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2012-10-28 14:54:22 +01:00
Aurelien Jarno	b202d41ee7	tcg: forbid ld/st function to modify globals Mapping a memory address using a global and accessing it through ld/st operations is currently broken. As it doesn't make any sense to do that performance wise, let's forbid that. Update the TCG documentation, and remove partial support for that. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2012-10-28 14:54:22 +01:00
Aurelien Jarno	344028ba0f	tcg: fix some op flags Some branch related ops are marked with TCG_OPF_SIDE_EFFECTS, some other not. In practice they don't need to, as they are all marked with TCG_OPF_BB_END, which is handled specifically in all the code. The call op is marked as TCG_OPF_SIDE_EFFECTS, which might be not true as there is are specific flags (TCG_CALL_CONST and TCG_CALL_PURE) for specifying that. On the other hand it always clobber arguments, so mark it as such even if the call op is handled in a different code path. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2012-10-28 14:54:22 +01:00
Aurelien Jarno	2c0366f036	tcg: don't explicitly save globals and temps The liveness analysis ensures that globals and temps are at the correct state at a basic block end or with an op with side effects. Avoid looping on all temps, this can be time consuming on targets with a lot of globals. Keep an assert in debug mode. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2012-10-28 14:54:22 +01:00
Aurelien Jarno	7dfd8c6aa1	tcg: start with local temps in TEMP_VAL_MEM state Start with local temps in TEMP_VAL_MEM state, to make possible a later check that all the temps are correctly saved back to memory. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2012-10-28 14:54:22 +01:00
Aurelien Jarno	a52ad07e7c	tcg: always mark dead input arguments as dead Always mark dead input arguments as dead, even if the op is at the basic block end. This will allow to check that all temps are correctly saved. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2012-10-28 14:54:22 +01:00
Aurelien Jarno	c29c1d7edf	tcg: rewrite tcg_reg_alloc_mov() Now that the liveness analysis provides more information, rewrite tcg_reg_alloc_mov(). This changes the behaviour about propagating constants and memory accesses. We now take the assumption that once a value is loaded into a register (from memory or from a constant), it's better to keep it there than to reload it later. This assumption is now always almost correct given that we are now sure the corresponding temp is going to be used later (otherwise it would have been synchronized and marked as dead already). The assumption is wrong if one of the op after clobbers some registers including the one of the holding the temp (this can be avoided by allocating clobbered registers last, which is what most TCG target do), or in case of lack of available register. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2012-10-28 14:54:22 +01:00
Aurelien Jarno	4c4e1ab26b	tcg: improve tcg_reg_alloc_movi() Now that the liveness analysis might mark some output temps as dead, call temp_dead() if needed. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2012-10-28 14:54:21 +01:00
Aurelien Jarno	9c43b68de6	tcg: rework liveness analysis Rework the liveness analysis by tracking temps that need to go back to memory in addition to dead temps tracking. This allows to mark output arguments as "need sync", and to synchronize them back to memory as soon as they are not written anymore. This way even arguments mapping to globals can be marked as "dead", avoiding moves to a new register when input and outputs are aliased. In addition it means that registers are freed as soon as temps are not used anymore, instead of waiting for a basic block end or an op with side effects. This reduces register spilling especially on CPUs with few registers, and spread the mov over all the TB, increasing the performances on in-order CPUs. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2012-10-28 14:54:21 +01:00
Aurelien Jarno	ec7a869d31	tcg: sync output arguments on liveness request Synchronize an output argument when requested by the liveness analysis. This is needed so that the temp can be declared dead later. For that, add a new op_sync_args table in which each bit tells if the corresponding output argument needs to be synchronized with the memory. Pass it to the tcg_reg_alloc_* functions, and honor this bit. We need to synchronize the argument before marking it as dead, and we have to make sure all the infos about the temp are correctly filled. At the same time change some types from unsigned int to uint16_t when passing op_dead_args. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2012-10-28 14:54:21 +01:00
Aurelien Jarno	1ad80729be	tcg: add temp_sync() Add a new function temp_sync() to synchronize the canonical location of a temp with the value in the corresponding register, but without freeing the associated register. Rewrite temp_save() to call temp_sync() followed by temp_dead(). Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2012-10-28 14:54:21 +01:00
Aurelien Jarno	7f6ceedf9c	tcg: add tcg_reg_sync() Add a new function tcg_reg_sync() to synchronize the canonical location of a temp with the value in the associated register, but without freeing it. Rewrite tcg_reg_free() to first call tcg_reg_sync() and then to free the register. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2012-10-28 14:54:21 +01:00
Aurelien Jarno	639368dd68	tcg: add temp_dead() A lot of code is duplicated to mark a temporary as dead. Replace it by temp_dead(), which in addition marks the temp as saved in memory for globals and local temps, instead of doing this a posteriori in temp_save(). Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2012-10-28 14:54:21 +01:00
Aurelien Jarno	17b914912d	tcg/i386: remove ld/st third argument register constraint On x86_64, remove the constraint on the third argument register which is not needed: - For loads the helper arguments are env, addr, mem_idx. The addr value should not be in the two first argument registers as they are used in tcg_out_tlb_load(). - For stores the helper arguments are env, addr, data, mem_idx. The addr and data values should not be in the two first argument registers as they are used in tcg_out_tlb_load(). The data value should also not be in the two first argument registers, but could be in the third argument register in which case it would be already loaded at the right location. Reviewed-by: Richard Henderson <rth@twiddle.net> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>	2012-10-28 14:54:15 +01:00

... 2 3 4 5 6 ...

1051 Commits