Provides and compares C# implementations of non-cryptographic Hashes

gimpf/Haschisch.Kastriert

Haschisch: A .NET Library for Non-Cryptographic Hashes

Haschisch provides several non-cryptographic hash algorithms for .NET, featuring a common API and high-performance implementations.

Getting Started

  1. Install the .NET Core 2 SDK.
  2. Build the solution: dotnet publish -c Release
  3. Run the benchmarks: dotnet bin/Release/netcoreapp2.0/publish/Haschisch.Benchmarks.dll -j:core_x64 -- CombineHashCodes --allcategories=prime,per-ad-seed

To use one of the hashers:

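// Hashes a whole byte array with any block hasher T.
// T is constrained to struct, so default(T) yields a usable hasher without allocating.
int HashWithBlock<T>(byte[] data) where T : struct, IBlockHasher<int> =>
    default(T).Hash(data, 0, data.Length);
// ...
// Example: hash an empty input with the xxHash64 block hasher.
HashWithBlock<XXHash64Hasher.Block>(new byte[0]);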

Current State

Available algorithms:

  • XXHash: xx32, xx64
  • CityHash: city32, city64, city64 w/seeds
  • HalfSip: hsip13, hsip24
  • Sip: sip13, sip24
  • Also-rans: seahash, marvin32, spookyv2

Most algorithms are available for these interfaces:

  • IHashCodeCombiner: Combines the hash-codes of .NET values into a new hash-code; these are precursors of, and alternatives to, the new System.HashCode type.
  • IStreamingHasher<int>: Incremental (streaming) hash calculation, following the well-known init-mix-mix-finish interface. The CityHash variants do not support this.
  • IBlockHasher<int>: Calculates the hash of a contiguous block of memory.
  • IUnsafeBlockHasher<int>: As above, but works on a ref byte instead of an array, and is therefore unsafe.
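
As a rough illustration of the init-mix-mix-finish shape and of how a block hasher relates to it, the sketch below implements both variants for 32-bit FNV-1a. FNV-1a is not one of Haschisch's algorithms, and the names IToyStreamingHasher, Fnv1aStream and Fnv1aBlock are hypothetical; the members of Haschisch's actual interfaces may differ.

```csharp
using System;

// Hypothetical interface modelled on the init-mix-finish pattern;
// this is NOT Haschisch's actual API.
public interface IToyStreamingHasher
{
    void Initialize();
    void Mix(byte value);
    int Finish();
}

// Streaming 32-bit FNV-1a: the state is carried between Mix calls.
public struct Fnv1aStream : IToyStreamingHasher
{
    private const uint OffsetBasis = 2166136261;
    private const uint Prime = 16777619;
    private uint state;

    public void Initialize() => state = OffsetBasis;

    // FNV-1a step: XOR the byte in, then multiply by the prime.
    public void Mix(byte value) => state = unchecked((state ^ value) * Prime);

    public int Finish() => unchecked((int)state);
}

public static class Fnv1aBlock
{
    // Block variant: hashes a contiguous range of an array in one call.
    public static int Hash(byte[] data, int offset, int length)
    {
        var hasher = default(Fnv1aStream); // struct, so no heap allocation
        hasher.Initialize();
        for (var i = offset; i < offset + length; i++)
        {
            hasher.Mix(data[i]);
        }
        return hasher.Finish();
    }
}
```

Feeding the same bytes through Initialize, Mix and Finish one at a time yields the same value as a single Fnv1aBlock.Hash call over those bytes; a block hasher mainly saves the per-call overhead of many small updates.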

The size of the hash-codes depends on the algorithm, but hash-codes shortened to 32 bits are available through a common API.
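
How a wider hash is shortened is not stated here. One common approach, shown as an assumption rather than as Haschisch's method, is to XOR-fold the upper half of a 64-bit value into the lower half. HashFolding and Fold64To32 are hypothetical names.

```csharp
public static class HashFolding
{
    // XOR-folds the upper 32 bits into the lower 32 bits, so that every
    // bit of the 64-bit hash can influence the shortened hash-code.
    // Assumption: illustrative only; the library may shorten differently.
    public static int Fold64To32(ulong hash) =>
        unchecked((int)(uint)(hash ^ (hash >> 32)));
}
```

Plain truncation to the low 32 bits is the other obvious choice; folding costs one extra shift and XOR but does not discard the upper half outright.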

Performance remarks regarding algorithm selection:

  • XXHash, Murmur-3-32, Sip and Half-Sip perform well, that is, they are not much slower than the reference C implementations.
  • For combining hash-codes, the Combiner type for XXHash32 is the fastest across both 64-bit and 32-bit .NET targets. Sip-1-3 does well on 64-bit targets.
  • For large-ish messages (larger than 2 KiB), the Block implementation of xxHash64 seems acceptably fast on 64-bit targets.
  • CityHash, SpookyHash and SeaHash are usable, but slower than expected compared to XXHash and the other algorithms above.

Performance remarks for all algorithms:

  • The native implementations are always faster than anything here.
  • All implementations are non-allocating, that is, there are no hidden penalties from GC pressure.
  • Only RyuJIT yields usable performance. Running this on older runtimes will strongly disappoint you.
  • Some Stream-type implementations support block updates through the unsafe API. They should be good enough for file checksumming and similar tasks. Again, xxHash64 works acceptably well.
  • Using Stream-type implementations with small updates leads to disastrous performance.

Changelog

  • WIP: introduce new hash algorithms (sip13, sip24, city32, city64, city64-w-seeds, spookyv2) and optimize existing and new ones, especially for hash-code combining; also extend the test coverage and simplify benchmarking
  • 0.3.0: introduce IHashCodeCombiner, extend benchmarks to compare performance for use-cases related to dotnet/corefx issue 'Add System.HashCode'.
  • 0.2.0: port to .NETStandard 2.0 and .NET Core 2.0
  • 0.1.0: first public version, having Block and Stream hashers for xx32, xx64, seahash, marvin32, hsip13, hsip24

License

Parts of the Marvin32 implementation are available under the MIT license (see Marvin32Steps.cs).

Everything else is available under CC0:

Haschisch by Markus Grüneis

To the extent possible under law, the person who associated CC0 with Haschisch has waived all copyright and related or neighboring rights to Haschisch.

You should have received a copy of the CC0 legalcode along with this work. If not, see http://creativecommons.org/publicdomain/zero/1.0/.
