Commit Graph

819 Commits

Author SHA1 Message Date
easyaspi314 (Devin)
e5883c4a33 Improve wording on XXH3_accumulate_512 comment
Adds more detail on the changes compared to the original UMAC and reduces
run-ons.
2020-02-27 18:48:16 -05:00
easyaspi314 (Devin)
f2a78adb2a Be consistent with wording 2020-02-27 18:10:34 -05:00
easyaspi314 (Devin)
9363917d08 Swap good and bad in 64-bit subset comment, expand
Flows better.
2020-02-27 18:07:48 -05:00
easyaspi314 (Devin)
2685b58257 Add a quick comment about the 64-bit arithmetic subset
Added it at the same place as the Thumb sanity check because that made the
most sense.

Also noted that the requirements were not much more than XXH32.
2020-02-27 17:53:26 -05:00
easyaspi314 (Devin)
ce652cc503 Note the primary reused subroutines 2020-02-27 17:21:55 -05:00
easyaspi314 (Devin)
1acf797a85 Move 17to128 and 129to240 into a more logical location
Now, it is sorted by length from short to long.

Also, mention how mix32B is slower but better resists multiply by zero.
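
For readers following along, here is a rough sketch of why a 32-byte mixer better resists multiply by zero; the names and structure are simplified stand-ins (the secret/seed keying is omitted), not the actual XXH3 code.

```c
#include <stdint.h>

typedef struct { uint64_t low64, high64; } acc128;   /* illustrative accumulator */

/* Stand-in for a 16-byte multiplicative mix (a mix16B-style step): the only
 * property that matters here is that its result can be zero for unlucky
 * keyed inputs. */
static uint64_t mul_mix_sketch(uint64_t keyed_lo, uint64_t keyed_hi)
{
    return keyed_lo * keyed_hi;
}

/* mix32B-style step: the multiplicative mix of each 16-byte half is paired
 * with an additive fold of the *other* half, so even if one product
 * collapses to zero, the input still reaches the accumulator. */
static acc128 mix32_sketch(acc128 acc,
                           uint64_t a_lo, uint64_t a_hi,   /* first 16 bytes  */
                           uint64_t b_lo, uint64_t b_hi)   /* second 16 bytes */
{
    acc.low64  += mul_mix_sketch(a_lo, a_hi);
    acc.low64  ^= b_lo + b_hi;
    acc.high64 += mul_mix_sketch(b_lo, b_hi);
    acc.high64 ^= a_lo + a_hi;
    return acc;
}
```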
2020-02-27 15:43:18 -05:00
easyaspi314 (Devin)
fb999b1960 Fix yet another typo
Thanks, @aras-p
2020-02-27 15:36:35 -05:00
easyaspi314 (Devin)
b655d2db93 Fix copy-paste issue 2020-02-27 14:43:30 -05:00
easyaspi314 (Devin)
39feadea2e Fix minor typo 2020-02-27 14:40:07 -05:00
easyaspi314 (Devin)
a1ca4ff0e2 Comment on FARSH keys and Mum variant, indent fix 2020-02-27 14:22:45 -05:00
easyaspi314 (Devin)
0ffbf28843 Add some extra details, fix typo 2020-02-27 14:13:57 -05:00
easyaspi314 (Devin)
bba53920a5 Mention the seed-dependent collisions in mix16B
We know it exists, don't hide it.

It is highly unlikely to occur with proper seeding and random inputs,
and it doesn't occur on the 128-bit version, so make sure people are
aware of it.
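
To make the hazard concrete, here is a minimal sketch with illustrative names (the real routine folds a 128-bit product, omitted here): if either keyed term happens to be zero for a particular seed and secret, the product is zero and the other half of the input stops influencing the result.

```c
#include <stdint.h>

/* mix16B-like step, simplified. */
static uint64_t mix16_sketch(uint64_t in_lo, uint64_t in_hi,
                             uint64_t sec_lo, uint64_t sec_hi, uint64_t seed)
{
    uint64_t const x = in_lo ^ (sec_lo + seed);
    uint64_t const y = in_hi ^ (sec_hi - seed);
    /* Seed-dependent hazard: if x == 0 (i.e. in_lo == sec_lo + seed),
     * the result is 0 for every possible in_hi -- and vice versa. */
    return x * y;
}
```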
2020-02-27 14:00:54 -05:00
easyaspi314 (Devin)
daee1fb34e Document the short hash redo compared to XXH64. 2020-02-27 11:15:12 -05:00
easyaspi314 (Devin)
0767d9601d Document accumulate_512 and scrambleAcc, rename vsx typedefs
Comments are now synchronized across all SIMD implementations, and both
now have a summary block comment.

Additionally, VSX now uses xxh_u64x2 to match the scalar typedefs.
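
For context, a rough scalar sketch of the accumulate-512 round that these comments describe, with simplified helpers and illustrative names; the real per-lane details (lane-swapping order, endianness handling) may differ.

```c
#include <stdint.h>
#include <string.h>

#define ACC_NB 8   /* 512 bits of accumulators: 8 lanes of 64 bits */

/* Sketch helper: 64-bit load via memcpy (assumes a little-endian host). */
static uint64_t load64(const unsigned char* p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Accumulate-512-style round: each 8-byte lane is keyed with the secret,
 * split into 32-bit halves and multiplied 32x32->64 into its accumulator,
 * while the raw lane is added to a neighbouring accumulator. */
static void accumulate_512_sketch(uint64_t acc[ACC_NB],
                                  const unsigned char* input,
                                  const unsigned char* secret)
{
    size_t lane;
    for (lane = 0; lane < ACC_NB; lane++) {
        uint64_t const data_val = load64(input  + 8*lane);
        uint64_t const data_key = data_val ^ load64(secret + 8*lane);
        acc[lane ^ 1] += data_val;
        acc[lane]     += (data_key & 0xFFFFFFFFULL) * (data_key >> 32);
    }
}
```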
2020-02-27 10:52:22 -05:00
easyaspi314 (Devin)
06d13f72b5 Document 1to3 input setup 2020-02-27 10:17:08 -05:00
easyaspi314 (Devin)
9d278c565a Document the shift in XXH3_len_4to8_128b 2020-02-27 10:13:59 -05:00
easyaspi314 (Devin)
9d375060c8 Document 128-bit ops on XXH3_len_9to16_128b 2020-02-27 10:12:12 -05:00
Yann Collet
b54708f05b xxh128 len[4-8] : minor change
it's not useful to swap input segments:
the differentiation is already taken care of by the seed itself,
and keeping the number in the low bits slightly improves dispersion.
It may also improve speed for the specific case len=8 (constant).
2020-02-26 19:32:01 -08:00
Yann Collet
48933f0037
Merge pull request #311 from Cyan4973/xxh128
Simplify len [4,8] for xxh128
2020-02-26 18:13:37 -08:00
Yann Collet
5543c3dbe9 xxh128 len[4-8]: improved distribution quality 2020-02-26 16:07:18 -08:00
Yann Collet
4ca5b6e20e xxh128 : len [4-8]: shift len by << 2
to preserve oddness of multiplier,
as suggested by @easyaspi314.

Also : stats from << 2 look better than << 1
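
A quick sketch of the oddness argument, with hypothetical names: `len << 2` has its two low bits clear, so XORing it into an odd 64-bit multiplier can never flip bit 0, and the multiplier stays odd (hence invertible modulo 2^64).

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: fold the input length into an odd base multiplier
 * without losing oddness. Since (len << 2) has bits 0-1 cleared, the XOR
 * cannot change bit 0 of base_mult. */
static uint64_t keyed_multiplier(uint64_t base_mult /* assumed odd */, size_t len)
{
    return base_mult ^ ((uint64_t)len << 2);
}
```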
2020-02-26 14:50:28 -08:00
Yann Collet
c6013d80d9 xxh128: slight optimization for len [4,8] 2020-02-24 18:14:32 -08:00
Yann Collet
935f280a76 xxh128: speedup len [4,8] 2020-02-24 17:39:33 -08:00
Yann Collet
b0104d2a82 Merge branch 'dev' into xxh128 2020-02-24 16:16:10 -08:00
Yann Collet
eba72be9fe fixed prng using seed 2020-02-24 16:10:20 -08:00
Yann Collet
3a0c1c3336 fixed PerlinNoise test 2020-02-24 12:25:23 -08:00
Yann Collet
64f655a28e
Merge pull request #304 from easyaspi314/unicode-windows-fixes
Fix Unicode support on Windows, minor Windows tweaks
2020-02-24 09:55:47 -08:00
Yann Collet
71f0f6ffd3
Merge pull request #308 from Cyan4973/mul32len8test
Last variant for the 4to8 segment (mul32to64)
2020-02-24 09:52:33 -08:00
Yann Collet
8d80010b7b fixed seed space reduction
thanks to @easyaspi314
2020-02-22 10:29:04 -08:00
Yann Collet
ee460fdbbb minor variation passing the PRNG test 2020-02-21 10:46:57 -08:00
Yann Collet
c8c4cc0f81
Merge pull request #309 from easyaspi314/compiler-specific-fixes
Compiler specific fixes
2020-02-21 10:14:59 -08:00
easyaspi314 (Devin)
5309e282ce Force -O2 on GCC + AVX2, document split load
With -O3, GCC goes overboard on unrolling the AVX2 path, producing slower
code than MSVC and Clang.

We can override that with a pragma that forces GCC to use -O2 instead.

Note that GCC still generates the best scalar and SSE2 code with -O3.

I also noted that GCC will split _mm256_loadu_si256 into two instructions
on a generic+avx2 target (an optimization that only benefits the non-AVX2
Sandy Bridge and Ivy Bridge chips), and provided the recommended flags.
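
For reference, the override described above looks roughly like the following (guards simplified; the real ones in xxHash are more thorough):

```c
/* Apply -O2 to the AVX2 functions only, then restore the user's settings. */
#if defined(__GNUC__) && !defined(__clang__) && defined(__AVX2__)
#  pragma GCC push_options
#  pragma GCC optimize("-O2")   /* avoid -O3's excessive unrolling here */
#endif

/* ... AVX2 accumulate/scramble functions go here ... */

#if defined(__GNUC__) && !defined(__clang__) && defined(__AVX2__)
#  pragma GCC pop_options
#endif
```

As for the split load, GCC's `-mno-avx256-split-unaligned-load` (or an `-march` that actually targets an AVX2 CPU) is the kind of flag that keeps `_mm256_loadu_si256` as a single `vmovdqu`.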
2020-02-21 10:11:14 -05:00
easyaspi314 (Devin)
777ec6529a Implement alternative byteshift load
XXH_FORCE_MEMORY_ACCESS==3 will use a byteshift operation. This is
preferred on older compilers which don't inline `memcpy()`, or on some
big-endian systems without a native byteswap.
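
A minimal sketch of what a byteshift load looks like (a 32-bit little-endian read; the actual macro-driven implementation differs): bytes are read one at a time and assembled with shifts, so no unaligned access, `memcpy()`, or native byteswap is needed.

```c
#include <stdint.h>

/* Byteshift read: endianness is fixed by construction, regardless of the
 * host's byte order. */
static uint32_t read32_le_byteshift(const void* ptr)
{
    const uint8_t* p = (const uint8_t*)ptr;
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```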

Also fix a small typo.
2020-02-21 10:01:04 -05:00
easyaspi314 (Devin)
558c9a97bf XXH_mult32to64: Use downcast+upcast instead of mask
Old/stupid compilers may generate a redundant mask and a full 64x64 multiply
in XXH_mult32to64, e.g. ARM GCC 2.95:

```c
xxh_u64 XXH_mult32to64(xxh_u64 a, xxh_u64 b)
{
    return (a & 0xffffffff) * (b & 0xffffffff);
}
```

`arm-gcc-2.95 -O3 -S -march=armv4t -mcpu=arm7tdmi -fomit-frame-pointer`
```asm
XXH_mult32to64:
        push    {r4, r5, r6, r7, lr}
        mov     r5, #0
        mov     r4, #0xffffffff
        mov     r7, r5
        mov     r6, r4
        @ mask 32-bit registers by 0x00000000 and 0xffffffff ?!?!?!
        and     r6, r6, r0
        and     r7, r7, r1
        and     r4, r4, r2
        and     r5, r5, r3
        @ full 64x64->64 multiply
        umull   r0, r1, r6, r4
        mla     r1, r6, r5, r1
        mla     r1, r4, r7, r1
        pop     {r4, r5, r6, r7, pc}
```

Meanwhile, using a downcast followed by an upcast generates the expected
code, albeit with some understandable regalloc weirdness (ARM support
was only recently added).

```c
xxh_u64 XXH_mult32to64(xxh_u64 a, xxh_u64 b)
{
    return (xxh_u64)(xxh_u32)a * (xxh_u64)(xxh_u32)b;
}
```

`arm-gcc-2.95 -O3 -S -march=armv4t -mcpu=arm7tdmi -fomit-frame-pointer`
```asm
XXH_mult32to64:
        push    {r4, lr}
        umull   r3, r4, r0, r2
        mov     r1, r4
        mov     r0, r3
        pop     {r4, pc}
```

Switching to this implementation may also remove the requirement for
`__emulu` on MSVC x86, but it hasn't been tested yet.

All modern compilers should recognize both patterns, but it seems that
old 32-bit compilers will prefer the latter, making this a free
optimization.
2020-02-21 06:09:06 -05:00
easyaspi314 (Devin)
77b74c9dc9 Put __attribute__((aligned)) after struct member
Improves compatibility with old GCC versions.
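An illustrative example of the placement change (the struct and member names are hypothetical): some old GCC releases only accept the attribute reliably when it follows the member declarator.

```c
struct example_state {
    /* attribute placed after the member, not before the type */
    unsigned long long acc[8] __attribute__((aligned(64)));
    unsigned long long total_len;
};
```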
2020-02-21 06:05:55 -05:00
Yann Collet
00d5458761 minor optimization
slightly faster for ARM and x86
2020-02-20 15:15:56 -08:00
Yann Collet
6c3f96a9cc minor simplification 2020-02-20 09:00:32 -08:00
Yann Collet
de90226410 try a variation of len_4to8 using mult32to64
adding swap instructions to better dispatch bits.
2020-02-20 00:29:12 -08:00
Yann Collet
5fad59e746 make len8 part of 8to16 2020-02-19 21:02:00 -08:00
Yann Collet
77d65ff45f fixed Perlin Noise test 2020-02-19 17:45:21 -08:00
Yann Collet
741400a5b1 disabled checksum validation
while formula is in flux
2020-02-19 16:08:36 -08:00
Yann Collet
f1051cda49 joined len==8 into 4to8 2020-02-19 16:01:20 -08:00
Yann Collet
123db71fd0 improvement vs mul0 2020-02-19 15:58:19 -08:00
Yann Collet
f486b3c7c4 try a mul32to64 formula for len_4to8 2020-02-19 15:45:43 -08:00
Yann Collet
6456c04490 added likely
removed bijectivity
2020-02-19 14:41:56 -08:00
easyaspi314 (Devin)
0197a2b5b0 Improve comments for Windows Unicode wrappers. 2020-02-14 20:48:12 -05:00
easyaspi314 (Devin)
f0627bc321 Add explicit rules for object files to include FLAGS
This fixes the Clang appveyor build.

Now, FLAGS will always be applied when compiling the object files and when linking.
2020-02-14 19:44:30 -05:00
easyaspi314 (Devin)
b5cef6dce0 Fix accidental typo
Don't know how I did that one.
2020-02-14 19:14:08 -05:00
easyaspi314 (Devin)
cac3ca4d5d Implement a safer Unicode test
This new test doesn't use any Unicode in the source files, instead
encoding all UTF-8 and UTF-16 as hex.

The test script will be generated from a C file, in which both a shell
script and a batch script will be generated, as well as the Unicode file
to test.
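
A minimal sketch of the hex-encoding idea (the filename and output names below are examples, not the test's real ones): the source stays pure ASCII, and the UTF-8 bytes are emitted from escaped values when the script is generated.

```c
#include <stdio.h>

int main(void)
{
    /* "é.unicode" spelled as raw UTF-8 bytes; no Unicode in the source. */
    static const unsigned char utf8_name[] = {
        0xC3, 0xA9, '.', 'u', 'n', 'i', 'c', 'o', 'd', 'e', 0
    };
    FILE* script = fopen("unicode_test.sh", "wb");   /* hypothetical output */
    if (script == NULL) return 1;
    fprintf(script, "#!/bin/sh\n./xxhsum \"%s\"\n", (const char*)utf8_name);
    fclose(script);
    return 0;
}
```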

On Cygwin, MinGW, and MSYS, we will automatically bail from the shell
script to the batch script, as cmd.exe has more reliable Unicode
support, at least on Windows 7 and later.

When the make rule is called, it first checks whether `$LANG` contains UTF-8,
which defines the (overridable) ENABLE_UNICODE flag. If not, it skips the
test with a warning.

Also fixed an issue with printf in multiInclude.c causing warnings on
old MinGW versions which expect %I64, and updated the .gitignore.
2020-02-14 19:08:09 -05:00
Yann Collet
993dcf89f7 fixed xxhsum verification values (partial) 2020-02-13 21:53:35 -08:00