Commit Graph

182 Commits

Author SHA1 Message Date
Yann Collet
29ff35da80 Improved help message
following suggestions from #345.

Note: this diff contains only cosmetic changes;
I've avoided any change that would impact the behavior of the program.
2020-04-07 15:19:08 -07:00
Yann Collet
fa1368ad7a
Merge pull request #346 from easyaspi314/unicode_fix_3
Fix Unicode hopefully for the last time
2020-04-07 14:18:10 -07:00
James Z.M. Gao
4b12376862 add optimized implementation for AVX512 targets 2020-04-06 15:58:26 +08:00
easyaspi314 (Devin)
ab2862f867 Fix Unicode hopefully for the last time
Use `__wgetmainargs()` instead of `GetCommandLineW()` and
`CommandLineToArgvW()`. While it is an internal API, aside from
declaring the function, it actually simplifies things a lot, as we can
set up a `wmain()` equivalent in a few lines.

It also avoids linking to `Shell32.dll`, which is a lot of bloat for a
simple command line utility.

Most importantly, this supports wildcards on `cmd.exe`, fixing #341.

Also improved comments and did minor cleanup.
2020-03-28 15:40:51 -04:00
easyaspi314 (Devin)
7ea2940fd6 Fix a few warnings
- Trailing comma in enum in xxhsum.c
- Extra semicolon
- Non-constant brace list initializers
2020-03-16 12:29:49 -04:00
easyaspi314 (Devin)
7328e217af Re-wrap comments.
Helps when you set your guideline to the correct column.
2020-03-10 14:12:15 -04:00
easyaspi314 (Devin)
a532573cf9 New Windows Unicode solution, works on XP
Now uses WriteConsoleW instead of vfwprintf. This is codepage
independent and works on Windows XP.

Doing this also avoids the _O_U8TEXT hack.
2020-03-10 14:06:09 -04:00
easyaspi314 (Devin)
962813dc7d Use better wording for short hash name
XXH3_64b seeded -> XXH3_64b w/seed
XXH3_64b secret -> XXH3_64b w/secret
2020-03-09 21:26:25 -04:00
easyaspi314 (Devin)
e30c5cc908 xxhsum.c: Remove some unused macros
MEM_MODULE and SET_SPARSE_FILE_MODE aren't even used, remove them.
2020-03-09 20:03:15 -04:00
easyaspi314 (Devin)
3e567ea317 Add withSecret variants to the benchmark
Moved sanityBuffer generator to its own function and made a static
buffer to use as a secret. Easier than passing a pointer around.
2020-03-09 19:44:04 -04:00
easyaspi314 (Devin)
a5e21d1da6 Abstract xxhsum bench, increase line length
Uses a table and a loop to reduce copy/paste and allow easy testing of
other hash functions. Create a wrapper and insert it into
g_hashesToBench, and it will automatically be added to the benchmark.

The hash display line has been made longer to actually fit xxHash's
names instead of clipping them. It is also configurable.
2020-03-09 19:02:48 -04:00
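The table-and-loop approach described in a5e21d1da6 can be sketched roughly as follows. This is a minimal illustration, not the xxhsum code: the wrapper signature, the two toy hash functions, the `hashes_to_bench` table, and `NAME_WIDTH` are all hypothetical names chosen for the example. It only prints hash values; the timing part of a real benchmark is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical wrapper signature: every hash is adapted to this shape. */
typedef uint32_t (*hash_wrapper_fn)(const void* data, size_t len, uint32_t seed);

typedef struct {
    const char*     name;
    hash_wrapper_fn func;
} hash_entry;

/* Two toy hashes standing in for real ones (assumptions for the sketch). */
static uint32_t toy_fnv1a(const void* data, size_t len, uint32_t seed)
{
    const unsigned char* p = (const unsigned char*)data;
    uint32_t h = 2166136261u ^ seed;
    size_t i;
    for (i = 0; i < len; i++) { h ^= p[i]; h *= 16777619u; }
    return h;
}

static uint32_t toy_sum(const void* data, size_t len, uint32_t seed)
{
    const unsigned char* p = (const unsigned char*)data;
    uint32_t h = seed;
    size_t i;
    for (i = 0; i < len; i++) h += p[i];
    return h;
}

/* Adding a hash means adding one wrapper and one row in this table. */
static const hash_entry hashes_to_bench[] = {
    { "toy_fnv1a", toy_fnv1a },
    { "toy_sum",   toy_sum   },
};
#define NB_HASHES  (sizeof(hashes_to_bench) / sizeof(hashes_to_bench[0]))
#define NAME_WIDTH 24   /* configurable width so long names are not clipped */

/* Runs every table entry on the same input; returns how many were run. */
static size_t run_all(const void* data, size_t len, uint32_t seed)
{
    size_t i;
    for (i = 0; i < NB_HASHES; i++) {
        uint32_t const h = hashes_to_bench[i].func(data, len, seed);
        printf("%-*s : %08x\n", NAME_WIDTH, hashes_to_bench[i].name, (unsigned)h);
    }
    return NB_HASHES;
}
```

Calling `run_all("abc", 3, 0)` prints one line per table entry, so a new function shows up in the output as soon as its row is added.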
Yann Collet
1dc959b74d
Merge pull request #332 from easyaspi314/documentation-v2
More documentation and cleanup
2020-03-09 15:16:40 -07:00
easyaspi314 (Devin)
de893872d5 Document Unicode behavior on XP and 7 2020-03-09 17:53:07 -04:00
Yann Collet
3f06265869
Merge pull request #330 from easyaspi314/align-malloc
[HOTFIX] Align malloc, use createState/freeState in sanity check
2020-03-08 23:28:38 -07:00
easyaspi314 (Devin)
c994f5c9ef Fix copyright years
- Replace '-present' with '-2020' (fixes #329)
- Use correct format: Copyright (C) <year> <name of author>
- Fix some obviously incorrect years from copy/paste, i.e. avoid time travel
2020-03-08 21:28:43 -04:00
easyaspi314 (Devin)
6aa4b03e35 Use createState/freeState in the sanity check
A malloc alignment bug went unnoticed since all states were allocated on
the stack.

It is a more orthogonal usage and tests the allocation functions.
2020-03-08 20:48:56 -04:00
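To illustrate why heap allocation can expose an alignment bug that stack allocation hides (6aa4b03e35), here is a minimal sketch of one common way to get an aligned block out of plain `malloc`. It is an assumption for illustration only, not the allocator used by xxHash: `STATE_ALIGN`, `aligned_alloc_sketch`, and `aligned_free_sketch` are hypothetical names.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical alignment: a power of two, small enough that the
 * offset (at most STATE_ALIGN) fits in one byte. */
#define STATE_ALIGN 64

/* Over-allocate, round the pointer up to the next aligned address,
 * and remember the offset in the byte just before the returned block. */
static void* aligned_alloc_sketch(size_t size)
{
    unsigned char* const base = (unsigned char*)malloc(size + STATE_ALIGN);
    if (base == NULL) return NULL;
    {
        size_t const offset = STATE_ALIGN - ((uintptr_t)base & (STATE_ALIGN - 1));
        unsigned char* const ptr = base + offset;   /* offset is in [1, STATE_ALIGN] */
        ptr[-1] = (unsigned char)offset;
        return ptr;
    }
}

/* Recover the original malloc pointer from the stored offset and free it. */
static void aligned_free_sketch(void* p)
{
    if (p != NULL) {
        unsigned char* const ptr = (unsigned char*)p;
        free(ptr - ptr[-1]);
    }
}
```

A sanity check that only ever declares states on the stack never reaches a path like this, which is why exercising the allocation functions in the test is useful.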
Yann Collet
ff5df558b7 changed xxhash.com links to https 2020-03-04 18:36:13 -08:00
Yann Collet
d685a4e3d6 updated xxhsum verif values
for xxh128
2020-03-04 12:32:18 -08:00
easyaspi314 (Devin)
e516a04f58 Use better wording on Windows wrapper comments
Remove useless "useless" comments (aside from main, that one has a
meaning) and elaborate on fprintf/fwprintf issue.
2020-03-03 22:08:12 -05:00
easyaspi314 (Devin)
4b7cc243e7 Merge branch 'dev' of https://github.com/Cyan4973/xxHash into unicode-fix-v2 2020-03-03 22:07:12 -05:00
easyaspi314 (Devin)
61861afb36 Switch stdout/stderr to UTF-8 on Windows
Uses an adapted version of the fwprintf wrapper suggested by t-mat,
and _setmode.

Also tries to work around classic MinGW not defining _O_U8TEXT.
2020-03-03 20:56:19 -05:00
easyaspi314 (Devin)
9ea4471de8 Merge branch 'dev' into typo-hunt-ch2 2020-03-03 19:06:37 -05:00
easyaspi314 (Devin)
87e7d8b999 More typos, add some more documentation
- Remove most remaining spaces before punctuation
- Fix a few missed copyright messages
- Document the timer resolution workaround
- Document XXH_mult32to64
  - I compiled GCC 3.2 and 4.2 just to test this; both are affected.
  - Make sure we downcast for __emulu
- Other minor fixes
2020-03-03 12:10:19 -05:00
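The log for 87e7d8b999 does not spell out what `XXH_mult32to64` does beyond its name. Assuming it is a 32-bit by 32-bit multiply producing a 64-bit result, a portable form could look like the sketch below. `MULT32TO64_SKETCH` and `mul_low_halves` are hypothetical names, and the compiler issue mentioned in the commit is not reproduced here.

```c
#include <stdint.h>

/* Hypothetical portable form: truncate both operands to 32 bits,
 * then widen, so the product is an exact 64-bit value. */
#define MULT32TO64_SKETCH(x, y) \
    ((uint64_t)(uint32_t)(x) * (uint64_t)(uint32_t)(y))

/* Multiplies only the low 32 bits of each argument. */
static uint64_t mul_low_halves(uint64_t a, uint64_t b)
{
    return MULT32TO64_SKETCH(a, b);
}
```

The explicit `(uint32_t)` casts are the "downcast" step: any high bits in the operands are discarded before the multiply.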
easyaspi314 (Devin)
9d91cd0836 Fix typo in feature test macros.
The typo hunt continues, this time it is my fault.

_M_IX86, _M_X64. Why it isn't _M_IX64, I have no idea.
2020-03-02 23:25:38 -05:00
Yann Collet
5c01134499 updated checksum verification values 2020-03-02 17:04:11 -08:00
easyaspi314 (Devin)
6fdff4b423 Fix copyright message formatting, update copyright years
- Add missing copyright for generate_unicode_test.c
  - use my real name, whatever
- Fix blatantly incorrect copyright years
- Update copyright years in xxhash.c/xxhash.h
- Fix formatting
- More typo fixes in multiInclude.c
2020-03-02 15:52:43 -05:00
easyaspi314 (Devin)
9eb91a3b53 Let the Great Typo Hunt commence!
Work in progress.

 - Fix many spelling/grammar issues, primarily in comments
 - Remove most spaces before punctuation
 - Update XXH3 comment
 - Wrap most comments to 80 columns
 - Unify most comments to use the same style
 - Use hexadecimal in the xxhash spec
 - Update help messages to better match POSIX/GNU conventions
 - Use HTML escapes in README.md to avoid UTF-8
 - Mark outdated benchmark/scores
2020-03-02 15:20:49 -05:00
Yann Collet
6f82266eca restored checksum validation for xxh3 & xxh128
triggered when starting `xxhsum` in benchmark mode.
2020-02-29 19:38:06 -08:00
Yann Collet
d63a8b3e16 fix minor warning (unused function)
will be used once checksum verification is re-enabled
2020-02-28 16:02:19 -08:00
Yann Collet
4e88e37d21 disabled checksum tests
since all values have changed.

To be re-enabled later.
2020-02-28 15:45:55 -08:00
easyaspi314 (Devin)
3b12ce3102 Improve architecture detection
- Properly detect MSVC x86
- Make ARM more detailed - there are so many variants that change
  xxHash's performance that "arm" or "arm + NEON" is not specific
  enough.
- Make sure x86_64 and aarch64 show SSE2 and NEON respectively.
2020-02-28 13:39:27 -05:00
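Architecture detection of the kind described in 3b12ce3102 is usually done with predefined compiler macros. The sketch below is a simplified assumption of how that might look. `ARCH_SKETCH` and `arch_name` are hypothetical names, and the ARM branch is much coarser than the detailed variants the commit mentions.

```c
/* GCC/Clang and MSVC spell their architecture macros differently,
 * so each branch tests both families. */
#if defined(__x86_64__) || defined(_M_X64) || defined(_M_AMD64)
#  define ARCH_SKETCH "x86_64 + SSE2"      /* SSE2 is baseline on x86_64 */
#elif defined(__i386__) || defined(_M_IX86)
#  if defined(__SSE2__) || (defined(_M_IX86_FP) && _M_IX86_FP == 2)
#    define ARCH_SKETCH "i386 + SSE2"
#  else
#    define ARCH_SKETCH "i386"
#  endif
#elif defined(__aarch64__) || defined(_M_ARM64)
#  define ARCH_SKETCH "aarch64 + NEON"     /* NEON is baseline on aarch64 */
#elif defined(__arm__) || defined(_M_ARM)
#  if defined(__ARM_NEON) || defined(__ARM_NEON__)
#    define ARCH_SKETCH "arm + NEON"
#  else
#    define ARCH_SKETCH "arm"
#  endif
#else
#  define ARCH_SKETCH "unknown"
#endif

/* Returns the label selected at compile time. */
static const char* arch_name(void)
{
    return ARCH_SKETCH;
}
```

Because everything is resolved by the preprocessor, the label costs nothing at run time and can be printed directly in a banner.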
Yann Collet
64f655a28e
Merge pull request #304 from easyaspi314/unicode-windows-fixes
Fix Unicode support on Windows, minor Windows tweaks
2020-02-24 09:55:47 -08:00
Yann Collet
741400a5b1 disabled checksum validation
while formula is in flux
2020-02-19 16:08:36 -08:00
easyaspi314 (Devin)
0197a2b5b0 Improve comments for Windows Unicode wrappers. 2020-02-14 20:48:12 -05:00
Yann Collet
993dcf89f7 fixed xxhsum verification values (partial) 2020-02-13 21:53:35 -08:00
easyaspi314 (Devin)
9bd98b0b45 Fix errors on older MinGW and MSVC
Always use wmain on MSVC, and use _wfopen instead of _wfopen_s.
2020-02-13 18:48:25 -05:00
easyaspi314 (Devin)
261c28b676 Fix Unicode support on Windows, minor Windows tweaks
- Unicode filenames should now work, with a method that works with
  and without Unicode mode on Windows.
  - Added a test in the Makefile
- Use unbuffered stderr output on Windows, fixes output not updating
  immediately on MinGW.
- Fix some missing $(EXT)s in the Makefile, causing Clang to emit
  xxhsum instead of xxhsum.exe on Windows, as well as xxhsum's rule
  ignoring $(FLAGS).
2020-02-12 20:37:34 -05:00
Yann Collet
1a67ed4437 return non-zero on empty string
answering: https://github.com/Cyan4973/xxHash/issues/175#issuecomment-548108921

The probability of receiving an empty string is higher than that of a random input (> 1 / 2^64),
making the generated hash more "common".

For some algorithms, it's an issue if this "more common" value is 0.

Maps it instead to an avalanche of an arbitrary start value (prime64).
The start value is blended with the `seed` and the `secret`,
so that the result depends on them too.
2020-02-12 15:22:13 -08:00
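The idea in 1a67ed4437 can be sketched as below. The exact blend used by the real code is not given in the log, so the way `seed` and `secret` are combined here is an assumption, as are the names `avalanche64` and `hash_empty_sketch`. The prime constants and the shift/multiply sequence are those of the XXH64 finalizer.

```c
#include <stdint.h>
#include <string.h>

#define PRIME64_1 0x9E3779B185EBCA87ULL
#define PRIME64_2 0xC2B2AE3D27D4EB4FULL
#define PRIME64_3 0x165667B19E3779F9ULL

/* XXH64-style finalizer: xorshifts and odd multiplies, all invertible,
 * so the only input that maps to 0 is 0 itself. */
static uint64_t avalanche64(uint64_t h)
{
    h ^= h >> 33;
    h *= PRIME64_2;
    h ^= h >> 29;
    h *= PRIME64_3;
    h ^= h >> 32;
    return h;
}

/* Hypothetical blend: start from a prime, add the seed, and xor in
 * the first 8 bytes of the secret before avalanching. */
static uint64_t hash_empty_sketch(uint64_t seed, const unsigned char* secret)
{
    uint64_t s;
    memcpy(&s, secret, sizeof(s));   /* avoids unaligned reads */
    return avalanche64((PRIME64_1 + seed) ^ s);
}
```

In this sketch the result is 0 only for the single seed/secret combination that cancels the start value, so the default case (seed 0, fixed secret) no longer returns 0 for the empty input.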
Yann Collet
dadf1ef766 fix xxhash.h include from xxh3.h 2020-02-12 14:36:11 -08:00
Yann Collet
160e37d349 Merge branch 'dev' into s390x 2020-02-12 12:12:46 -08:00
Yann Collet
6a4e870843 removed cygwin code path
it's not useful:
cygwin uses the posix code path

fix #100
2020-02-10 16:37:04 -08:00
Yann Collet
7d4c33a025 fixed a bunch of cppcheck minor warnings
not all, as some are plain false positives with no obvious replacement.
2019-12-27 16:17:33 -08:00
easyaspi314 (Devin)
6aa9beeb50 [s390x] Identify s390x in xxhsum 2019-12-14 16:30:22 -05:00
Yann Collet
ae245428b8 update internal benchmark
to reduce the risk of rounding bias
when a measurement takes too little time.
2019-12-10 13:12:28 -08:00
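One common way to limit rounding bias from very short measurements, as ae245428b8 aims to, is to keep increasing the loop count until a run lasts long enough for the timer resolution to be negligible. The sketch below assumes that approach; it is not the actual xxhsum benchmark. `MIN_TICKS`, `MAX_LOOPS`, `workload`, and `seconds_per_call` are hypothetical names.

```c
#include <time.h>

/* Hypothetical floor: a measurement must last at least 10 ms of CPU time. */
#define MIN_TICKS ((clock_t)(CLOCKS_PER_SEC / 100))
/* Safety cap so the loop terminates even if clock() is unusable. */
#define MAX_LOOPS (1UL << 20)

static volatile unsigned g_sink;   /* volatile keeps the work from being optimized away */

/* Stand-in for the function being measured. */
static void workload(void)
{
    unsigned i;
    for (i = 0; i < 100; i++) g_sink += i;
}

/* Doubles the iteration count until the run is long enough,
 * then returns the average time per call in seconds. */
static double seconds_per_call(void (*fn)(void), unsigned long* loops_used)
{
    unsigned long loops = 1;
    for (;;) {
        unsigned long i;
        clock_t const start = clock();
        clock_t elapsed;
        for (i = 0; i < loops; i++) fn();
        elapsed = clock() - start;
        if (elapsed >= MIN_TICKS || loops >= MAX_LOOPS) {
            *loops_used = loops;
            return ((double)elapsed / (double)CLOCKS_PER_SEC) / (double)loops;
        }
        loops *= 2;
    }
}
```

With a 10 ms floor, a one-tick timing error is a small fraction of the measured interval instead of a large share of it.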
Yann Collet
e6dc2443f0 fix: multiple include with/without XXH_STATIC_LINKING 2019-10-11 07:50:06 -07:00
Yann Collet
4e4570f751 removed non-error messages from stderr when specifying -q 2019-10-07 08:25:57 -07:00
Yann Collet
73e6c5206c simplified type declaration
some types were not needed.

Also: xxh_u* types are only necessary within libxxhash,
not xxhsum
2019-10-07 07:52:32 -07:00
easyaspi314 (Devin)
368a6f9699 Improve typedefs, fix 16-bit int/seed type bug
Fixes #258.

```c
BYTE -> xxh_u8
U32  -> xxh_u32
U64  -> xxh_u64
```

Additionally, I hopefully fixed an issue for targets where int is 16
bits. XXH32 used unsigned int for its seed, and in C90 mode, unsigned
int as its U32. This would cause truncation issues. I check limits.h in
C90 mode to make sure UINT_MAX == 0xFFFFFFFFUL, and if it isn't, use
unsigned long.

We should see if we can set up an AVR CI test. Just to run the
verification program, though, as the benchmark will take a very long
time.

Lastly, the seed types are XXH32_hash_t and XXH64_hash_t for XXH32/64.
This matches xxhash.c and prevents the aforementioned 16-bit int bug.
2019-10-06 19:14:12 -04:00
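The `limits.h` check described in 368a6f9699 can be sketched like this. The type names `u32_sketch` and `u8_sketch` and the helper `seed_roundtrip` are hypothetical; only the `UINT_MAX == 0xFFFFFFFFUL` test and the fallback to `unsigned long` come from the commit message.

```c
#include <limits.h>

/* C90 has no <stdint.h>, so pick a 32-bit-capable type by hand. */
#if UINT_MAX == 0xFFFFFFFFUL
typedef unsigned int  u32_sketch;   /* int is exactly 32 bits here */
#else
typedef unsigned long u32_sketch;   /* e.g. 16-bit int: long is at least 32 bits */
#endif
typedef unsigned char u8_sketch;

/* A seed parameter typed as u32_sketch is not truncated,
 * unlike one typed as unsigned int on a 16-bit-int target. */
static u32_sketch seed_roundtrip(u32_sketch seed)
{
    return seed;
}
```

On a target where `int` is 16 bits, a seed such as `0x12345678` passed through an `unsigned int` parameter would lose its upper half; with the typedef above it is preserved.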
easyaspi314 (Devin)
91d6e4927e Use both seeded and unseeded variants in the bench
Previously, XXH3_64bits looked much faster than XXH3_128bits. The truth
is that they are similar on long keys. The difference was that
XXH3_64b's benchmark was unseeded, giving it an unfair advantage
over XXH128, which is seeded.

I don't think I am going to do the dummy bench. That made things more
complicated.
2019-10-01 23:23:55 -04:00
Yann Collet
c8f3fb514c factorized mix32B
changing xxh128 results for len within 129-240.
2019-09-30 22:36:07 -07:00