xemu-project/xemu - xemu - Gitea: Git with a cup of tea

mirror of https://github.com/xemu-project/xemu.git synced 2024-11-24 03:59:52 +00:00

Author	SHA1	Message	Date
Emilio G. Cota	d9fe9db943	hardfloat: implement float32/64 comparison Performance results for fp-bench: Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: cmp-single: 110.98 MFlops cmp-double: 107.12 MFlops - after: cmp-single: 506.28 MFlops cmp-double: 524.77 MFlops Note that flattening both eq and eq_signaling versions would give us extra performance (695v506, 615v524 Mflops for single/double, respectively) but this would emit two essentially identical functions for each eq/signaling pair, which is a waste. Aggregate performance improvement for the last few patches: [ all charts in png: https://imgur.com/a/4yV8p ] 1. Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz qemu-aarch64 NBench score; higher is better Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz 16 +-+-----------+-------------+----===-------+---===-------+-----------+-+ 14 +-+..........................@@@&&.=.......@@@&&.=...................+-+ 12 +-+..........................@.@.&.=.......@.@.&.=.....+befor=== +-+ 10 +-+..........................@.@.&.=.......@.@.&.=.....+ad@@&& = +-+ 8 +-+.......................$$$%.@.&.=.......@.@.&.=.....+ @@u& = +-+ 6 +-+............@@@&&=+*##.$%.@.&.=##$$%+@.&.=..###$$%%@i& = +-+ 4 +-+.......###$%%.@.&=...#.$%.@.&.=..#.$%.@.&.=+.#+$ +@m& = +-+ 2 +-+.....*.#$.%.@.&=...#.$%.@.&.=..#.$%.@.&.=..#+$+sqr& = +-+ 0 +-+-----##$%%@@&&=-##$$%@@&&==##$$%@@&&==-##$$%+cmp==-----+-+ FOURIER NEURAL NELU DECOMPOSITION gmean qemu-aarch64 SPEC06fp (test set) speedup over QEMU `4c2c101590` Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz error bars: 95% confidence interval 4.5 +-+---+-----+----+-----+-----+-&---+-----+----+-----+-----+-----+----+-----+-----+-----+-----+----+-----+---+-+ 4 +-+..........................+@@+...........................................................................+-+ 3.5 +-+..............%%@&.........@@..............%%@&............................................+++dsub +-+ 2.5 +-+....&&+.......%%@&.......+%%@..+%%&+..@@&+.%%@&....................................+%%&+.+%@&++%%@& +-+ 2 +-+..+%%&..+%@&+.%%@&...+++..%%@...%%&.+$$@&..%%@&..%%@&.......+%%&+.%%@&+......+%%@&.+%%&++$$@&++d%@& %%@&+-+ 1.5 +-+#$%&#$@&#%@&$%@#$%@#$%&#$@&$%@&#$%@#$%@#$%&#%@&$%@&#$%@#$%&#$@&+f%@&$%@&+-+ 0.5 +-+#$%&#$@&#%@&$%@#$%@#$%&#$@&$%@&#$%@#$%@#$%&#%@&$%@&#$%@#$%&#$@&+sqr@&$%@&+-+ 0 +-+#$%&#$@&#%@&$%@#$%@#$%&#$@&$%@&#$%@#$%@#$%&#%@&$%@&#$%@#$%&#$@&+cmp&$%@&+-+ 410.bw416.gam433.434.z435.436.cac437.lesli444.447.de450.so453454.ca459.GemsF465.tont470.lb4482.sphinxgeomean 2. Host: ARM Aarch64 A57 @ 2.4GHz qemu-aarch64 NBench score; higher is better Host: Applied Micro X-Gene, Aarch64 A57 @ 2.4 GHz 5 +-+-----------+-------------+-------------+-------------+-----------+-+ 4.5 +-+........................................@@@&==...................+-+ 3 4 +-+..........................@@@&==........@.@&.=.....+before +-+ 3 +-+..........................@.@&.=........@.@&.=.....+ad@@@&== +-+ 2.5 +-+.....................##$$%%.@&.=........@.@&.=.....+ @m@& = +-+ 2 +-+............@@@&==.#.$.%.@&.=.#$$%%.@&.=.#$$%%d@& = +-+ 1.5 +-+.....*#$$%%.@&.=..#.$.%.@&.=..#.$.%.@&.=..#+$ +f@& = +-+ 0.5 +-+......#.$.%.@&.=..#.$.%.@&.=..#.$.%.@&.=..#+$+sqr& = +-+ 0 +-+-----#$$%%@@&==-#$$%%@@&==-#$$%%@@&==-*#$$%+cmp==-----+-+ FOURIER NEURAL NLU DECOMPOSITION gmean Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2018-12-17 08:25:25 +00:00
Emilio G. Cota	f131bae8a7	hardfloat: implement float32/64 square root Performance results for fp-bench: Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: sqrt-single: 42.30 MFlops sqrt-double: 22.97 MFlops - after: sqrt-single: 311.42 MFlops sqrt-double: 311.08 MFlops Here USE_FP makes a huge difference for f64's, with throughput going from ~200 MFlops to ~300 MFlops. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2018-12-17 08:25:25 +00:00
Emilio G. Cota	ccf770ba73	hardfloat: implement float32/64 fused multiply-add Performance results for fp-bench: 1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: fma-single: 74.73 MFlops fma-double: 74.54 MFlops - after: fma-single: 203.37 MFlops fma-double: 169.37 MFlops 2. ARM Aarch64 A57 @ 2.4GHz - before: fma-single: 23.24 MFlops fma-double: 23.70 MFlops - after: fma-single: 66.14 MFlops fma-double: 63.10 MFlops 3. IBM POWER8E @ 2.1 GHz - before: fma-single: 37.26 MFlops fma-double: 37.29 MFlops - after: fma-single: 48.90 MFlops fma-double: 59.51 MFlops Here having 3FP64 set to 1 pays off for x86_64: [1] 170.15 vs [0] 153.12 MFlops Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2018-12-17 08:25:25 +00:00
Emilio G. Cota	4a6295613f	hardfloat: implement float32/64 division Performance results for fp-bench: 1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: div-single: 34.84 MFlops div-double: 34.04 MFlops - after: div-single: 275.23 MFlops div-double: 216.38 MFlops 2. ARM Aarch64 A57 @ 2.4GHz - before: div-single: 9.33 MFlops div-double: 9.30 MFlops - after: div-single: 51.55 MFlops div-double: 15.09 MFlops 3. IBM POWER8E @ 2.1 GHz - before: div-single: 25.65 MFlops div-double: 24.91 MFlops - after: div-single: 96.83 MFlops div-double: 31.01 MFlops Here setting 2FP64_USE_FP to 1 pays off for x86_64: [1] 215.97 vs [0] 62.15 MFlops Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2018-12-17 08:25:25 +00:00
Emilio G. Cota	2dfabc86e6	hardfloat: implement float32/64 multiplication Performance results for fp-bench: 1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: mul-single: 126.91 MFlops mul-double: 118.28 MFlops - after: mul-single: 258.02 MFlops mul-double: 197.96 MFlops 2. ARM Aarch64 A57 @ 2.4GHz - before: mul-single: 37.42 MFlops mul-double: 38.77 MFlops - after: mul-single: 73.41 MFlops mul-double: 76.93 MFlops 3. IBM POWER8E @ 2.1 GHz - before: mul-single: 58.40 MFlops mul-double: 59.33 MFlops - after: mul-single: 60.25 MFlops mul-double: 94.79 MFlops Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2018-12-17 08:25:25 +00:00
Emilio G. Cota	1b615d4820	hardfloat: implement float32/64 addition and subtraction Performance results (single and double precision) for fp-bench: 1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: add-single: 135.07 MFlops add-double: 131.60 MFlops sub-single: 130.04 MFlops sub-double: 133.01 MFlops - after: add-single: 443.04 MFlops add-double: 301.95 MFlops sub-single: 411.36 MFlops sub-double: 293.15 MFlops 2. ARM Aarch64 A57 @ 2.4GHz - before: add-single: 44.79 MFlops add-double: 49.20 MFlops sub-single: 44.55 MFlops sub-double: 49.06 MFlops - after: add-single: 93.28 MFlops add-double: 88.27 MFlops sub-single: 91.47 MFlops sub-double: 88.27 MFlops 3. IBM POWER8E @ 2.1 GHz - before: add-single: 72.59 MFlops add-double: 72.27 MFlops sub-single: 75.33 MFlops sub-double: 70.54 MFlops - after: add-single: 112.95 MFlops add-double: 201.11 MFlops sub-single: 116.80 MFlops sub-double: 188.72 MFlops Note that the IBM and ARM machines benefit from having HARDFLOAT_2F{32,64}_USE_FP set to 0. Otherwise their performance can suffer significantly: - IBM Power8: add-single: [1] 54.94 vs [0] 116.37 MFlops add-double: [1] 58.92 vs [0] 201.44 MFlops - Aarch64 A57: add-single: [1] 80.72 vs [0] 93.24 MFlops add-double: [1] 82.10 vs [0] 88.18 MFlops On the Intel machine, having 2F64 set to 1 pays off, but it doesn't for 2F32: - Intel i7-6700K: add-single: [1] 285.79 vs [0] 426.70 MFlops add-double: [1] 302.15 vs [0] 278.82 MFlops Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2018-12-17 08:25:25 +00:00
Emilio G. Cota	a94b783952	fpu: introduce hardfloat The appended paves the way for leveraging the host FPU for a subset of guest FP operations. For most guest workloads (e.g. FP flags aren't ever cleared, inexact occurs often and rounding is set to the default [to nearest]) this will yield sizable performance speedups. The approach followed here avoids checking the FP exception flags register. See the added comment for details. This assumes that QEMU is running on an IEEE754-compliant FPU and that the rounding is set to the default (to nearest). The implementation-dependent specifics of the FPU should not matter; things like tininess detection and snan representation are still dealt with in soft-fp. However, this approach will break on most hosts if we compile QEMU with flags that break IEEE compatibility. There is no way to detect all of these flags at compilation time, but at least we check for -ffast-math (which defines __FAST_MATH__) and disable hardfloat (plus emit a #warning) when it is set. This patch just adds common code. Some operations will be migrated to hardfloat in subsequent patches to ease bisection. Note: some architectures (at least PPC, there might be others) clear the status flags passed to softfloat before most FP operations. This precludes the use of hardfloat, so to avoid introducing a performance regression for those targets, we add a flag to disable hardfloat. In the long run though it would be good to fix the targets so that at least the inexact flag passed to softfloat is indeed sticky. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2018-12-17 08:25:25 +00:00
Emilio G. Cota	f9943c7f76	softfloat: rename canonicalize to sf_canonicalize glibc >= 2.25 defines canonicalize in commit eaf5ad0 (Add canonicalize, canonicalizef, canonicalizel., 2016-10-26). Given that we'll be including <math.h> soon, prepare for this by prefixing our canonicalize() with sf_ to avoid clashing with the libc's canonicalize(). Reported-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Tested-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2018-12-17 08:25:25 +00:00
Thomas Huth	97ff87c0ed	qemu/compiler: Wrap __attribute__((flatten)) in a macro Older versions of Clang (before 3.5) and GCC (before 4.1) do not support the "__attribute__((flatten))" yet. We don't care about such old versions of GCC anymore, but since Clang 3.4 is still used in EPEL for RHEL7 / CentOS 7, we should not use this attribute directly but with a wrapper macro instead. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>	2018-10-17 08:36:28 +02:00
Richard Henderson	5dfbc9e490	softfloat: Fix division The __udiv_qrnnd primitive that we nicked from gmp requires its inputs to be normalized. We were not doing that. Because the inputs are nearly normalized already, finishing that is trivial. Replace div128to64 with a "proper" udiv_qrnnd, so that this remains a reusable primitive. Fixes: `cf07323d49` Fixes: https://bugs.launchpad.net/qemu/+bug/1793119 Tested-by: Emilio G. Cota <cota@braap.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-05 12:57:41 -05:00
Thomas Huth	0019d5c3a1	softfloat: Replace countLeadingZeros32/64 with clz32/64 Our minimum required compiler for compiling QEMU is GCC 4.1 these days, so we can drop the support for compilers which do not provide the __builtin_clz*() functions yet. Since the countLeadingZeros32/64 are then identical to the clz32/64 functions, and we do not have to sync the softloat 2 codebase with upstream anymore (softloat 3 is a complete rewrite) we can simply replace the functions with our QEMU versions. Suggested-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <1538118095-7003-1-git-send-email-thuth@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-05 12:57:41 -05:00
Emilio G. Cota	c953da8f0b	softfloat: remove float64_trunc_to_int It has not had users since `f83311e476` ("target-m68k: use floatx80 internally", 2017-06-21). Note that no other bit-width has floatX_trunc_to_int. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-05 12:57:41 -05:00
Richard Henderson	2f6c74be59	softfloat: Add scaling float-to-int routines Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180814002653.12828-3-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-08-24 13:17:30 +01:00
Richard Henderson	2abdfe2440	softfloat: Add scaling int-to-float routines Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180814002653.12828-2-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-08-24 13:17:29 +01:00
Richard Henderson	64d450a0ea	softfloat: Fix missing inexact for floating-point add For 0x1.0000000000003p+0 + 0x1.ffffffep+14 = 0x1.0001fffp+15 we dropped the sticky bit and so failed to raise inexact. Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com> Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com> Message-id: 20180810193129.1556-7-richard.henderson@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-08-16 14:29:58 +01:00
Richard Henderson	377ed92679	fpu/softfloat: Define floatN_silence_nan in terms of parts_silence_nan Isolate the target-specific choice to 3 functions instead of 6. The code in floatx80_default_nan tried to be over-general. There are only two targets that support this format: x86 and m68k. Thus there is no point in inventing a mechanism for snan_bit_is_one. Move routines that no longer have ifdefs out of softfloat-specialize.h. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-05-17 15:27:15 -07:00
Richard Henderson	0218a16e54	fpu/softfloat: Define floatN_default_nan in terms of parts_default_nan Isolate the target-specific choice to 2 functions instead of 6. The code in float16_default_nan was only correct for ARM, MIPS, and X86. Though float16 support is rare among our targets. The code in float128_default_nan was arguably wrong for Sparc. While QEMU supports the Sparc 128-bit insns, no real cpu enables it. The code in floatx80_default_nan tried to be over-general. There are only two targets that support this format: x86 and m68k. Thus there is no point in inventing a value for snan_bit_is_one. Move routines that no longer have ifdefs out of softfloat-specialize.h. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-05-17 15:27:15 -07:00
Richard Henderson	3bd2dec1a1	fpu/softfloat: Pass FloatClass to pickNaNMulAdd For each operand, pass a single enumeration instead of a pair of booleans. The commit also merges multiple different ifdef-selected implementations of pickNaNMulAdd into a single function whose body is ifdef-selected. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-05-17 15:27:15 -07:00
Richard Henderson	4f251cfd52	fpu/softfloat: Pass FloatClass to pickNaN For each operand, pass a single enumeration instead of a pair of booleans. The commit also merges multiple different ifdef-selected implementations of pickNaN into a single function whose body is ifdef-selected. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-05-17 15:27:15 -07:00
Richard Henderson	247d1f2190	fpu/softfloat: Make is_nan et al available to softfloat-specialize.h We will need these helpers within softfloat-specialize.h, so move the definitions above the include. After specialization, they will not always be used so mark them to avoid the Werror. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-05-17 15:27:15 -07:00
Alex Bennée	6fed16b265	fpu/softfloat: re-factor float to float conversions This allows us to delete a lot of additional boilerplate code which is no longer needed. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-05-17 15:27:15 -07:00
Alex Bennée	ca3a3d5a31	fpu/softfloat: Partial support for ARM Alternative half-precision For float16 ARM supports an alternative half-precision format which sacrifices the ability to represent NaN/Inf in return for a higher dynamic range. The new FloatFmt flag, arm_althp, is then used to modify the behaviour of canonicalize and round_canonical with respect to representation and exception raising. Usage of this new flag waits until we re-factor float-to-float conversions. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-05-17 15:27:15 -07:00
Richard Henderson	0bcfbcbea5	fpu/softfloat: Replace float_class_msnan with parts_silence_nan With a canonical representation of NaNs, we can silence an SNaN immediately rather than delay until the final format is known. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-05-17 15:24:19 -07:00
Richard Henderson	f7e598e264	fpu/softfloat: Replace float_class_dnan with parts_default_nan With a canonical representation of NaNs, we can return the default nan directly rather than delay the expansion until the final format is known. Note one case where we uselessly assigned to a.sign, which was overwritten/ignored later when expanding float_class_dnan. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-05-17 15:24:19 -07:00
Richard Henderson	298b468e43	fpu/softfloat: Introduce parts_is_snan_frac Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-05-17 15:24:19 -07:00
Richard Henderson	94933df0e5	fpu/softfloat: Canonicalize NaN fraction Shift the NaN fraction to a canonical position, much like we do for the fraction of normal numbers. This will facilitate manipulation of NaNs within the shared code paths. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-05-17 15:24:19 -07:00
Richard Henderson	0664335a6e	fpu/softfloat: Move softfloat-specialize.h below FloatParts definition We want to be able to specialize on the canonical representation. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-05-17 15:24:19 -07:00
Petr Tesarik	6603d50648	fpu/softfloat: Fix conversion from uint64 to float128 The significand is passed to normalizeRoundAndPackFloat128() as high first, low second. The current code passes the integer first, so the result is incorrectly shifted left by 64 bits. This bug affects the emulation of s390x instruction CXLGBR (convert from logical 64-bit binary-integer operand to extended BFP result). Cc: qemu-stable@nongnu.org Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Petr Tesarik <ptesarik@suse.com> Message-Id: <20180511071052.1443-1-ptesarik@suse.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-05-17 15:24:19 -07:00
Peter Maydell	333583757c	fpu/softfloat: Don't set Invalid for float-to-int(MAXINT) In float-to-integer conversion, if the floating point input converts exactly to the largest or smallest integer that fits in to the result type, this is not an overflow. In this situation we were producing the correct result value, but were incorrectly setting the Invalid flag. For example for Arm A64, "FCVTAS w0, d0" on an input of 0x41dfffffffc00000 should produce 0x7fffffff and set no flags. Fix the boundary case to take the right half of the if() statements. This fixes a regression from 2.11 introduced by the softfloat refactoring. Cc: qemu-stable@nongnu.org Fixes: `ab52f973a5` Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180510140141.12120-1-peter.maydell@linaro.org	2018-05-15 14:58:42 +01:00
Alex Bennée	a5a5f5e2e4	fpu/softfloat: int_to_float ensure r fully initialised Reported by Coverity (CID1390635). We ensure this for uint_to_float later on so we might as well mirror that. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-05-15 14:58:42 +01:00
Peter Maydell	1839189bbf	softfloat: Handle default NaN mode after pickNaNMulAdd, not before It is implementation defined whether a multiply-add of (0,inf,qnan) or (inf,0,qnan) raises InvalidaOperation or not, so we let the target-specific pickNaNMulAdd function handle this. This means that we must do the "return the default NaN in default NaN mode" check after the call, not before. Correct the ordering, and restore the comment from the old propagateFloat64MulAddNaN() that warned about this corner case. This fixes a regression from 2.11 for Arm guests where we would incorrectly fail to set the Invalid flag for these cases. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20180504100547.14621-1-peter.maydell@linaro.org	2018-05-10 18:10:56 +01:00
Richard Henderson	ce8d408205	fpu: Bound increment for scalbn Without bounding the increment, we can overflow exp either here in scalbn_decomposed or when adding the bias in round_canonical. This can result in e.g. underflowing to 0 instead of overflowing to infinity. The old softfloat code did bound the increment. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-04-17 14:52:38 +01:00
Alex Bennée	9cb4e398c2	fpu/softfloat: check for Inf / x or 0 / x before /0 The re-factoring of div_floats changed the order of checking meaning an operation like -inf/0 erroneously raises the divbyzero flag. IEEE-754 (2008) specifies this should only occur for operations on finite operands. We fix this by moving the check on the dividend being Inf/0 to before the divisor is zero check. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20180416135442.30606-1-alex.bennee@linaro.org Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Tested-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-04-16 18:40:48 +01:00
Alex Bennée	801bc56336	fpu/softfloat: raise float_invalid for NaN/Inf in round_to_int_and_pack The re-factor broke the raising of INVALID when NaN/Inf is passed to the float_to_int conversion functions. round_to_uint_and_pack got this right for NaN but also missed out the Inf handling. Fixes https://bugs.launchpad.net/qemu/+bug/1759264 Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Tested-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20180413140334.26622-3-alex.bennee@linaro.org Cc: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-04-16 10:10:31 +01:00
Emilio G. Cota	6245327a36	softfloat: fix {min, max}nummag for same-abs-value inputs Before `8936006` ("fpu/softfloat: re-factor minmax", 2018-02-21), we used to return +Zero for maxnummag(-Zero,+Zero); after that commit, we return -Zero. Fix it by making {min,max}nummag consistent with {min,max}num, deferring to the latter when the absolute value of the operands is the same. With this fix we now pass fp-test. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20180413140334.26622-2-alex.bennee@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-04-13 15:47:53 +01:00
Richard Henderson	bd49e6027c	fpu: Fix rounding mode for floatN_to_uintM_round_to_zero We incorrectly passed in the current rounding mode instead of float_round_to_zero. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180410055912.934-1-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-04-10 13:02:26 +01:00
Stef O'Rear	cffad426f5	softfloat: fix crash on int conversion of SNaN Signed-off-by: Stef O'Rear <sorear2@gmail.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-03-09 14:30:12 +00:00
Laurent Vivier	0f605c889c	softfloat: use floatx80_infinity in softfloat Since `f3218a8` ("softfloat: add floatx80 constants") floatx80_infinity is defined but never used. This patch updates floatx80 functions to use this definition. This allows to define a different default Infinity value on m68k: the m68k FPU defines infinity with all bits set to zero in the mantissa. Signed-off-by: Laurent Vivier <laurent@vivier.eu> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20180224201802.911-4-laurent@vivier.eu>	2018-03-04 17:27:35 +01:00
Laurent Vivier	88857aca93	softfloat: export some functions Move fpu/softfloat-macros.h to include/fpu/ Export floatx80 functions to be used by target floatx80 specific implementations. Exports: propagateFloatx80NaN(), extractFloatx80Frac(), extractFloatx80Exp(), extractFloatx80Sign(), normalizeFloatx80Subnormal(), packFloatx80(), roundAndPackFloatx80(), normalizeRoundAndPackFloatx80() Also exports packFloat32() that will be used to implement m68k fsinh, fcos, fsin, ftan operations. Signed-off-by: Laurent Vivier <laurent@vivier.eu> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20180224201802.911-2-laurent@vivier.eu>	2018-03-04 17:22:55 +01:00
Alex Bennée	c13bb2da9e	fpu/softfloat: re-factor sqrt This is a little bit of a departure from softfloat's original approach as we skip the estimate step in favour of a straight iteration. There is a minor optimisation to avoid calculating more bits of precision than we need however this still brings a performance drop, especially for float64 operations. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2018-02-21 10:21:54 +00:00
Alex Bennée	0c4c909291	fpu/softfloat: re-factor compare The compare function was already expanded from a macro. I keep the macro expansion but move most of the logic into a compare_decomposed. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2018-02-21 10:21:47 +00:00
Alex Bennée	8936006707	fpu/softfloat: re-factor minmax Let's do the same re-factor treatment for minmax functions. I still use the MACRO trick to expand but now all the checking code is common. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2018-02-21 10:21:41 +00:00
Alex Bennée	0bfc9f1952	fpu/softfloat: re-factor scalbn This is one of the simpler manipulations you could make to a floating point number. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2018-02-21 10:21:34 +00:00
Alex Bennée	c02e1fb80b	fpu/softfloat: re-factor int/uint to float These are considerably simpler as the lower order integers can just use the higher order conversion function. As the decomposed fractional part is a full 64 bit rounding and inexact handling comes from the pack functions. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2018-02-21 10:21:29 +00:00
Alex Bennée	ab52f973a5	fpu/softfloat: re-factor float to int/uint We share the common int64/uint64_pack_decomposed function across all the helpers and simply limit the final result depending on the final size. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2018-02-21 10:21:22 +00:00
Alex Bennée	dbe4d53a59	fpu/softfloat: re-factor round_to_int We can now add float16_round_to_int and use the common round_decomposed and canonicalize functions to have a single implementation for float16/32/64 round_to_int functions. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>	2018-02-21 10:21:16 +00:00
Alex Bennée	d446830a3a	fpu/softfloat: re-factor muladd We can now add float16_muladd and use the common decompose and canonicalize functions to have a single implementation for float16/32/64 muladd functions. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>	2018-02-21 10:21:11 +00:00
Alex Bennée	cf07323d49	fpu/softfloat: re-factor div We can now add float16_div and use the common decompose and canonicalize functions to have a single implementation for float16/32/64 versions. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>	2018-02-21 10:21:06 +00:00
Alex Bennée	74d707e2cc	fpu/softfloat: re-factor mul We can now add float16_mul and use the common decompose and canonicalize functions to have a single implementation for float16/32/64 versions. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>	2018-02-21 10:20:59 +00:00
Alex Bennée	6fff216769	fpu/softfloat: re-factor add/sub We can now add float16_add/sub and use the common decompose and canonicalize functions to have a single implementation for float16/32/64 add and sub functions. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>	2018-02-21 10:20:53 +00:00

1 2 3 4

152 Commits