38 Commits

Author SHA1 Message Date
Yann Collet
f2163b8b75 changed algorithm for small inputs
- seed modifies the key values,
  the value that can trigger a zero multiply is now seed-dependent

- accumulator also receives input as addition,
  cancelling the impact of zero multiply.

Performance on small inputs seems slightly slower, but within measurement noise.
2019-03-16 21:27:39 -07:00
Yann Collet
f3d4bf4eef xxh3: ensure all 64-bits of the seed are ingested
in the len=[1-3] case
2019-03-16 20:32:20 -07:00
Yann Collet
4229399fc9 fixed minor warnings on Visual 2019-03-16 09:35:10 -07:00
Yann Collet
b810177b0a added Visual target on Appveyor 2019-03-16 09:18:56 -07:00
Yann Collet
701423eeda fixed most Visual compilation issues
still this dllimport thing,
I don't know why it was added,
maybe something to remove altogether.
2019-03-16 06:59:46 -07:00
Yann Collet
40dbf78fa9 renamed XXH128_hash_t members to low64 and high64 2019-03-14 13:08:38 -07:00
easyaspi314 (Devin)
c1ae3287a1 Update ARM NEON code
The NEON algorithms have now been updated to match the SSE2 algorithm.
2019-03-12 22:20:45 -04:00
Yann Collet
8423e82ef8 fixed last integration issues 2019-03-12 18:13:46 -07:00
Yann Collet
af852ac752 fixed last strict aliasing issues 2019-03-12 17:48:59 -07:00
Yann Collet
e6433e8dfd restored clang #pragma unroll statement
that had been accidentally lost in an update.
2019-03-12 17:36:37 -07:00
Yann Collet
3fe53a4ab9 fixed endianness issue 2019-03-12 15:57:56 -07:00
Yann Collet
51ac7dc7e9 fixed minor conversion warning
detected on ARM 32-bit
2019-03-12 12:56:52 -07:00
Yann Collet
c76d96454b xxh3: fixed declaration after statement in AVX2 path
also :
- added header license
- fixed alignment declaration
2019-03-12 11:55:37 -07:00
Yann Collet
405e49403c xxh3: fixed scalar variant
scrambling stage wasn't updated to match new formula
2019-03-11 15:40:01 -07:00
Yann Collet
638993f16b added consistency tests for XXH3_64b
validated against SSE2 path
2019-03-11 15:09:27 -07:00
Yann Collet
2010b7e7de fixed addition discrepancy between scalar and vector code
both now use a 64-bit addition with carry
2019-03-09 00:19:40 -05:00
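
For context on the commit above, here is a minimal sketch of how a lane-wise addition can diverge from a full 64-bit addition. The function names `add_2x32` and `add_1x64` are hypothetical and are not taken from the xxHash sources; the log does not say which code path used which form, only that both were aligned on a 64-bit addition with carry.

```c
#include <stdint.h>

/* Hypothetical illustration: add the two 32-bit halves independently.
 * The carry out of the low half is dropped, as with a 32-bit lane-wise add. */
static uint64_t add_2x32(uint64_t x, uint64_t y)
{
    uint32_t const low  = (uint32_t)x + (uint32_t)y;
    uint32_t const high = (uint32_t)(x >> 32) + (uint32_t)(y >> 32);
    return ((uint64_t)high << 32) | low;
}

/* Hypothetical illustration: a single 64-bit addition.
 * The carry from the low half propagates into the high half. */
static uint64_t add_1x64(uint64_t x, uint64_t y)
{
    return x + y;
}
```

The two functions agree until the low halves overflow; for example, adding 1 to 0xFFFFFFFF yields 0 with the lane-wise form and 0x100000000 with the 64-bit form, which is the kind of mismatch that would make two code paths produce different hashes.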
Yann Collet
c92d6bcdd8
Merge pull request #172 from easyaspi314/xxh3-pi-test
Add unroll pragma for Clang in XXH3_accumulate.
2019-03-08 22:37:31 -05:00
Yann Collet
a5d5bf778f improve algorithm by compensating UMAC deficiency
no longer possible to nullify one member through another
2019-03-08 22:32:11 -05:00
easyaspi314 (Devin)
60215c5bfb Fix typo causing build failure on 32-bit 2019-03-08 22:26:25 -05:00
easyaspi314 (Devin)
c5953f132c Add unroll pragma for Clang in XXH3_accumulate.
Clang doesn't unroll the XXH3_accumulate loop for some reason. Using
`#pragma clang loop unroll(enable)` to hint to Clang that it should
unroll results in a huge 1.4-1.5x speedup.

Before: 15 GB/s
After:  21 GB/s
2019-03-08 22:07:08 -05:00
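
The unroll hint described in the commit above could look roughly like the following. This is a sketch only: `sum_stripes` and its loop body are hypothetical stand-ins, not the actual `XXH3_accumulate` code. The pragma itself, `#pragma clang loop unroll(enable)`, is the one quoted in the commit message, and it is guarded here so that other compilers do not see it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an accumulate loop; not the real XXH3 code. */
static uint64_t sum_stripes(const uint64_t* data, size_t nbStripes)
{
    uint64_t acc = 0;
    size_t n;
    /* Only Clang understands this pragma, so hide it from other compilers. */
#if defined(__clang__)
#  pragma clang loop unroll(enable)
#endif
    for (n = 0; n < nbStripes; n++) {
        acc += data[n] * 0x9E3779B185EBCA87ULL;  /* arbitrary odd multiplier */
    }
    return acc;
}
```

The pragma is a hint and does not change the result of the loop, so any speedup should be confirmed by benchmarking on the target compiler and hardware.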
Yann Collet
2afd24d8bb xxh128: minor modifications to improve bias
1.4% => 0.6%
2019-03-08 16:03:24 -05:00
Yann Collet
27b4f31b77 Merge branch 'xxh3_128' into xxh3 2019-03-08 15:54:41 -05:00
Yann Collet
4f4f63c73b modified xxh128 so that low part == xxh3_64b 2019-03-08 15:37:06 -05:00
easyaspi314 (Devin)
02d0ba79a0 Remove preprocessor statement leftover from testing
What '0 &&' ? No idea what you are talking about...
2019-03-07 19:51:39 -05:00
easyaspi314 (Devin)
7558f18493 Add improved 128-bit multiply routine for 32-bit and use intrinsics long multiply 2019-03-07 17:26:49 -05:00
Yann Collet
a951c0aeba xxh3: updated mul128 with a 32-bit backup path
also:
started XXH128 (not finished yet)
2019-03-06 23:42:04 -05:00
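
As an illustration of what a 32-bit backup path for a 64x64 to 128-bit multiply can look like, here is a minimal portable sketch built from four 32x32 to 64-bit partial products. The type `u128_sketch` and the function `mul64to128_portable` are hypothetical names chosen for this example, and the code is not claimed to match the routine in these commits; the member names `low64` and `high64` simply mirror the naming mentioned elsewhere in this log.

```c
#include <stdint.h>

/* Hypothetical 128-bit result type for this sketch. */
typedef struct { uint64_t low64; uint64_t high64; } u128_sketch;

/* Portable 64x64 -> 128 multiply using only 64-bit arithmetic. */
static u128_sketch mul64to128_portable(uint64_t a, uint64_t b)
{
    /* Four partial products of the 32-bit halves. */
    uint64_t const lo_lo = (a & 0xFFFFFFFFULL) * (b & 0xFFFFFFFFULL);
    uint64_t const hi_lo = (a >> 32)           * (b & 0xFFFFFFFFULL);
    uint64_t const lo_hi = (a & 0xFFFFFFFFULL) * (b >> 32);
    uint64_t const hi_hi = (a >> 32)           * (b >> 32);
    /* Sum of the middle terms; this cannot overflow 64 bits. */
    uint64_t const cross = (lo_lo >> 32) + (hi_lo & 0xFFFFFFFFULL) + lo_hi;
    u128_sketch r;
    r.high64 = (hi_lo >> 32) + (cross >> 32) + hi_hi;
    r.low64  = (cross << 32) | (lo_lo & 0xFFFFFFFFULL);
    return r;
}
```

On targets with a native 128-bit type or a widening multiply intrinsic, a faster path would normally be selected at compile time, with a routine like this kept as the fallback.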
Yann Collet
8d96de3e1c added variant with seed 2019-03-06 17:46:42 -05:00
Yann Collet
48e3d724d1 updated xxh3 2019-03-06 11:55:48 -05:00
Yann Collet
1d32e2664a
Merge pull request #167 from easyaspi314/xxh3
xxh3: add NEON support
2019-02-28 17:40:33 -08:00
easyaspi314 (Devin)
8d345470e6 xxh3: add NEON support
Signed-off-by: easyaspi314 (Devin) <easyaspi314@users.noreply.github.com>
2019-02-28 20:28:29 -05:00
Yann Collet
b348fa896a restored 8-way mixer 2019-02-28 16:43:44 -08:00
Yann Collet
fa31d0b02f xxh3: fixed last minor quality metric
in extended tests
2019-02-27 15:03:23 -08:00
Yann Collet
5b827f538c improved 8-way mixer 2019-02-26 18:38:20 -08:00
Yann Collet
2be95459cd fixed minor c90 warning 2019-02-26 16:42:50 -08:00
Yann Collet
7784d41ce3 fixed ARM compilation error 2019-02-26 16:36:03 -08:00
Yann Collet
94bebd5b86 xxh3: more c90 compatibility 2019-02-26 15:55:25 -08:00
Yann Collet
43c10239c9 minor C90 adaptation fixes
added -Wconversion flag
2019-02-26 13:45:56 -08:00
Yann Collet
45f39e6d34 first implementation of XXH3_64b
currently can only be used for benchmarking (`-b`)
2019-02-26 12:36:23 -08:00