xxHash

mirror of https://github.com/FEX-Emu/xxHash.git synced 2024-11-24 06:59:40 +00:00

Author	SHA1	Message	Date
easyaspi314 (Devin)	06d13f72b5	Document 1to3 input setup	2020-02-27 10:17:08 -05:00
easyaspi314 (Devin)	9d278c565a	Document the shift in XXH3_len_4to8_128b	2020-02-27 10:13:59 -05:00
easyaspi314 (Devin)	9d375060c8	Document 128-bit ops on XXH3_len_9to16_128b	2020-02-27 10:12:12 -05:00
Yann Collet	b54708f05b	xxh128 len[4-8]: minor change. It's not useful to swap input segments: the differentiation from seed is already taken care of by the seed itself, and keeping the number in the low bits slightly improves dispersion. Also may improve speed for the specific case len=8 (constant).	2020-02-26 19:32:01 -08:00
Yann Collet	48933f0037	Merge pull request #311 from Cyan4973/xxh128 Simplify len [4,8] for xxh128	2020-02-26 18:13:37 -08:00
Yann Collet	5543c3dbe9	xxh128 len[4-8]: improved distribution quality	2020-02-26 16:07:18 -08:00
Yann Collet	4ca5b6e20e	xxh128 len [4-8]: shift len by << 2 to preserve oddness of multiplier, as suggested by @easyaspi314. Also: stats from << 2 look better than << 1.	2020-02-26 14:50:28 -08:00
Yann Collet	c6013d80d9	xxh128: slight optimization for len [4,8]	2020-02-24 18:14:32 -08:00
Yann Collet	935f280a76	xxh128: speedup len [4,8]	2020-02-24 17:39:33 -08:00
Yann Collet	b0104d2a82	Merge branch 'dev' into xxh128	2020-02-24 16:16:10 -08:00
Yann Collet	eba72be9fe	fixed prng using seed	2020-02-24 16:10:20 -08:00
Yann Collet	3a0c1c3336	fixed PerlinNoise test	2020-02-24 12:25:23 -08:00
Yann Collet	64f655a28e	Merge pull request #304 from easyaspi314/unicode-windows-fixes Fix Unicode support on Windows, minor Windows tweaks	2020-02-24 09:55:47 -08:00
Yann Collet	71f0f6ffd3	Merge pull request #308 from Cyan4973/mul32len8test Last variant for the 4to8 segment (mul32to64)	2020-02-24 09:52:33 -08:00
Yann Collet	8d80010b7b	fixed seed space reduction thanks to @easyaspi314	2020-02-22 10:29:04 -08:00
Yann Collet	ee460fdbbb	minor variation passing the PRNG test	2020-02-21 10:46:57 -08:00
Yann Collet	c8c4cc0f81	Merge pull request #309 from easyaspi314/compiler-specific-fixes Compiler specific fixes	2020-02-21 10:14:59 -08:00
easyaspi314 (Devin)	5309e282ce	Force -O2 on GCC + AVX2, document split load. GCC for AVX2 goes overboard on the unrolling with -O3, causing slower code than MSVC and Clang. We can override that with a pragma that forces GCC to use -O2 instead. Note that GCC still generates the best scalar and SSE2 code with -O3. I also mentioned the fact that GCC will split _mm256_loadu_si256 into two instructions on a generic+avx2 target (which is an optimization that only applies to the non-AVX2 Sandy and Ivy Bridge chips), and provided the recommended flags.	2020-02-21 10:11:14 -05:00
easyaspi314 (Devin)	777ec6529a	Implement alternative byteshift load. XXH_FORCE_MEMORY_ACCESS==3 will use a byteshift operation. This is preferred on older compilers which don't inline `memcpy()` or on some big-endian systems without a native byteswap. Also fix a small typo.	2020-02-21 10:01:04 -05:00
easyaspi314 (Devin)	558c9a97bf	XXH_mult32to64: Use downcast+upcast instead of mask	2020-02-21 06:09:06 -05:00

Old/stupid compilers may generate an erroneous mask in XXH_mult32to64, e.g. ARM GCC 2.95:

```c
xxh_u64 XXH_mult32to64(xxh_u64 a, xxh_u64 b)
{
    return (a & 0xffffffff) * (b & 0xffffffff);
}
```

`arm-gcc-2.95 -O3 -S -march=armv4t -mcpu=arm7tdmi -fomit-frame-pointer`

```asm
XXH_mult32to64:
        push    {r4, r5, r6, r7, lr}
        mov     r5, #0
        mov     r4, #0xffffffff
        mov     r7, r5
        mov     r6, r4
        @ mask 32-bit registers by 0x00000000 and 0xffffffff ?!?!?!
        and     r6, r6, r0
        and     r7, r7, r1
        and     r4, r4, r2
        and     r5, r5, r3
        @ full 64x64->64 multiply
        umull   r0, r1, r6, r4
        mla     r1, r6, r5, r1
        mla     r1, r4, r7, r1
        pop     {r4, r5, r6, r7, pc}
```

Meanwhile, using a downcast followed by an upcast generates the expected code, albeit with some understandable regalloc weirdness (ARM support was only recently added).

```c
xxh_u64 XXH_mult32to64(xxh_u64 a, xxh_u64 b)
{
    return (xxh_u64)(xxh_u32)a * (xxh_u64)(xxh_u32)b;
}
```

`arm-gcc-2.95 -O3 -S -march=armv4t -mcpu=arm7tdmi -fomit-frame-pointer`

```asm
XXH_mult32to64:
        push    {r4, lr}
        umull   r3, r4, r0, r2
        mov     r1, r4
        mov     r0, r3
        pop     {r4, pc}
```

Switching to this implementation may also remove the requirement for `__emulu` on MSVC x86, but it hasn't been tested yet. All modern compilers should recognize both patterns, but it seems that old 32-bit compilers will prefer the latter, making this a free optimization.

easyaspi314 (Devin)	77b74c9dc9	Put __attribute__((aligned)) after struct member. Improves compatibility with old GCC versions.	2020-02-21 06:05:55 -05:00
Yann Collet	00d5458761	minor optimization slightly faster for ARM and x86	2020-02-20 15:15:56 -08:00
Yann Collet	6c3f96a9cc	minor simplification	2020-02-20 09:00:32 -08:00
Yann Collet	de90226410	try a variation of len_4to8 using mult32to64 adding swap instructions to better dispatch bits.	2020-02-20 00:29:12 -08:00
Yann Collet	5fad59e746	make len8 part of 8to16	2020-02-19 21:02:00 -08:00
Yann Collet	77d65ff45f	fixed Perlin Noise test	2020-02-19 17:45:21 -08:00
Yann Collet	741400a5b1	disabled checksum validation while formula is in flux	2020-02-19 16:08:36 -08:00
Yann Collet	f1051cda49	joined len==8 into 4to8	2020-02-19 16:01:20 -08:00
Yann Collet	123db71fd0	improvement vs mul0	2020-02-19 15:58:19 -08:00
Yann Collet	f486b3c7c4	try a mul32to64 formula for len_4to8	2020-02-19 15:45:43 -08:00
Yann Collet	6456c04490	added likely removed bijectivity	2020-02-19 14:41:56 -08:00
easyaspi314 (Devin)	0197a2b5b0	Improve comments for Windows Unicode wrappers.	2020-02-14 20:48:12 -05:00
easyaspi314 (Devin)	f0627bc321	Add explicit rules for object files to include FLAGS. This fixes the Clang appveyor build. Now, FLAGS will always be applied to the object files and linker files.	2020-02-14 19:44:30 -05:00
easyaspi314 (Devin)	b5cef6dce0	Fix accidental typo. Don't know how I did that one.	2020-02-14 19:14:08 -05:00
easyaspi314 (Devin)	cac3ca4d5d	Implement a safer Unicode test. This new test doesn't use any Unicode in the source files, instead encoding all UTF-8 and UTF-16 as hex. The test script will be generated from a C file, in which both a shell script and a batch script will be generated, as well as the Unicode file to test. On Cygwin, MinGW, and MSYS, we will automatically bail from the shell script to the batch script, as cmd.exe has more reliable Unicode support, at least on Windows 7 and later. When the make rule is called, it first checks if `$LANG` contains UTF-8, defining the (overridable) ENABLE_UNICODE flag. If so, it will skip the test with a warning. Also fixed an issue with printf in multiInclude.c causing warnings on old MinGW versions which expect %I64, and updated the .gitignore.	2020-02-14 19:08:09 -05:00
Yann Collet	993dcf89f7	fixed xxhsum verification values (partial)	2020-02-13 21:53:35 -08:00
Yann Collet	9df2729931	Merge branch 'dev' into smallInputs	2020-02-13 21:31:16 -08:00
Yann Collet	0a5f34f8cf	modified small inputs for xxh3 in order to pass the new Perlin_noise test. Sizes 4-8 should also be slightly faster.	2020-02-13 19:14:58 -08:00
easyaspi314 (Devin)	9bd98b0b45	Fix errors on older MinGW and MSVC. Always use wmain on MSVC, and use _wfopen instead of _wfopen_s.	2020-02-13 18:48:25 -05:00
easyaspi314 (Devin)	dbe2addcc1	Move test-unicode to test-all. There are some theoretical systems which don't handle Unicode well, and test is designed to be pretty much universal. This locks it behind test-all.	2020-02-12 20:58:10 -05:00
easyaspi314 (Devin)	e460437a9d	Fix typo	2020-02-12 20:54:13 -05:00
easyaspi314 (Devin)	3593758487	Fix minor typo	2020-02-12 20:46:50 -05:00
easyaspi314 (Devin)	261c28b676	Fix Unicode support on Windows, minor Windows tweaks. Unicode filenames should now work, with a method that works with and without Unicode mode on Windows; added a test in the Makefile; use unbuffered stderr output on Windows, which fixes output not updating immediately on MinGW; fix some missing $(EXT)s in the Makefile, causing Clang to emit xxhsum instead of xxhsum.exe on Windows, as well as xxhsum's rule ignoring $(FLAGS).	2020-02-12 20:37:34 -05:00
Yann Collet	16f6cee1bf	Merge pull request #303 from Cyan4973/nullstring return non-zero on empty string :	2020-02-12 16:45:03 -08:00
Yann Collet	fa0a6ebc7f	fixed empty-string results on Big-Endian	2020-02-12 15:32:45 -08:00
Yann Collet	1a67ed4437	return non-zero on empty string. Answering https://github.com/Cyan4973/xxHash/issues/175#issuecomment-548108921 : the probability of receiving an empty string is larger than random (> 1 / 2^64), making the generated hash more "common". For some algorithms, it's an issue if this "more common" value is 0. Maps it instead to an avalanche of an arbitrary start value (prime64). The start value is blended with the `seed` and the `secret`, so that the result is dependent on those too.	2020-02-12 15:22:13 -08:00
Yann Collet	aee51d5e7b	Merge pull request #302 from Cyan4973/inline_all xxhash can be inlined even when previously included	2020-02-12 15:01:41 -08:00
Yann Collet	dadf1ef766	fix xxhash.h include from xxh3.h	2020-02-12 14:36:11 -08:00
Yann Collet	b8d761e5a7	Merge branch 'dev' into inline_all	2020-02-12 14:19:13 -08:00
Yann Collet	99077eca58	Merge pull request #301 from Cyan4973/s390x S390x	2020-02-12 14:18:37 -08:00
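
Commit 558c9a97bf in the log above replaces a mask with a downcast followed by an upcast in XXH_mult32to64. The two forms should compute the same 32x32->64 product; the following is a minimal standalone sketch of that equivalence, not the library's code. It assumes `uint32_t`/`uint64_t` from `<stdint.h>` in place of the `xxh_u32`/`xxh_u64` typedefs, and the names `mult32to64_mask` and `mult32to64_cast` are hypothetical, chosen only for this illustration.

```c
#include <stdint.h>

/* Hypothetical name: the masking form quoted in the commit message.
 * Both operands stay 64-bit; the upper halves are zeroed by the mask. */
static uint64_t mult32to64_mask(uint64_t a, uint64_t b)
{
    return (a & 0xffffffff) * (b & 0xffffffff);
}

/* Hypothetical name: the downcast+upcast form the commit switches to.
 * Truncating to 32 bits and widening again yields the same operands,
 * but states the 32x32->64 intent through the types. */
static uint64_t mult32to64_cast(uint64_t a, uint64_t b)
{
    return (uint64_t)(uint32_t)a * (uint64_t)(uint32_t)b;
}
```

Both functions ignore the upper 32 bits of each argument, so any difference between them is in the code a given compiler emits, as the commit message describes, rather than in the result.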


955 Commits
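
Commit 777ec6529a in the log above mentions a byteshift load as an alternative to `memcpy()` for reading unaligned input. As a rough sketch of what such a load can look like (the commit's actual code is not shown in this log, so this is an assumption about the general technique rather than the library's implementation), the hypothetical helpers below assemble a 32-bit value one byte at a time, which works regardless of host endianness or alignment:

```c
#include <stdint.h>

/* Hypothetical helper: read 4 bytes as a little-endian 32-bit value
 * using only byte loads and shifts. */
static uint32_t read_le32_byteshift(const void *ptr)
{
    const uint8_t *p = (const uint8_t *)ptr;
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: the same bytes interpreted as big-endian. */
static uint32_t read_be32_byteshift(const void *ptr)
{
    const uint8_t *p = (const uint8_t *)ptr;
    return (uint32_t)p[3]
         | ((uint32_t)p[2] << 8)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[0] << 24);
}
```

Because the byte order is fixed by the shifts themselves, no separate byteswap step is needed, which matches the motivation the commit message gives for big-endian systems without a native byteswap.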