
xxHash

C · ★ 11k · updated 13d ago

Extremely fast non-cryptographic hash algorithm

Ultra-fast hashing library in C that turns data into fingerprints at speeds that can exceed RAM read bandwidth, built for non-security uses like integrity checks, hash tables, and duplicate detection.

C · setup: easy · complexity 2/5

xxHash is a hashing library written in C. A hash function takes a piece of data, such as a file or a string of text, and produces a short fixed-length number called a hash. That number acts like a fingerprint: if the data changes even slightly, the hash changes too. Hashes are used constantly in software for things like quickly checking whether data has been corrupted, looking things up in tables, or detecting duplicate files.
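The fingerprint idea can be illustrated with a few lines of C. This is a minimal sketch that uses FNV-1a, a much simpler and slower hash, as a stand-in; it is not xxHash, and the function names `fnv1a64` and `same_fingerprint` are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* FNV-1a, 64-bit. A simple stand-in hash for illustration, NOT xxHash. */
uint64_t fnv1a64(const void *data, size_t len)
{
    const unsigned char *p = (const unsigned char *)data;
    uint64_t h = 0xcbf29ce484222325ULL;   /* FNV-1a 64-bit offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];                        /* mix in one input byte */
        h *= 0x100000001b3ULL;            /* multiply by the FNV 64-bit prime */
    }
    return h;
}

/* Hypothetical helper: returns 1 if both buffers hash to the same value. */
int same_fingerprint(const void *a, size_t a_len,
                     const void *b, size_t b_len)
{
    return fnv1a64(a, a_len) == fnv1a64(b, b_len);
}
```

A caller might store the fingerprint of a buffer and recompute it later: a mismatch shows the content changed, while a match makes a change very unlikely but does not strictly rule it out, since distinct inputs can collide.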

What sets xxHash apart is speed. The README benchmarks show its fastest variant, XXH3, processing data at roughly 31 gigabytes per second on a modern desktop processor, which is faster than that machine can read sequentially from RAM. Well-known hash functions such as MD5 and SHA-1 were designed with security in mind and run far more slowly. xxHash is not a security tool and makes no claim to be, but for non-security uses (integrity checking, hash tables, caching) it is much faster.

The library offers several variants. XXH32 produces a 32-bit hash and is suited to 32-bit processors. XXH64 produces a 64-bit hash and is suited to 64-bit systems. XXH3 (stable since version 0.8.0) produces either 64-bit or 128-bit hashes and is optimized for modern processors using vectorized (SIMD) arithmetic, which processes multiple values at once. All variants pass SMHasher, an independent test suite that evaluates quality properties such as how evenly the output values are distributed.

The code is written in plain C, produces identical hashes on processors with different byte orderings (little-endian and big-endian), and is available as either a single header file you drop into a project or a compiled library. It is free to use under a BSD-style license.
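One way to get the same hash on any byte ordering is to assemble every input word from individual bytes in a fixed order. The sketch below follows the publicly documented XXH32 algorithm and reads each 32-bit word as little-endian regardless of the host CPU. It is an unoptimized illustration, not the library's code, and the names `xxh32_sketch`, `read_le32`, `rotl32`, and `lane_round` are invented here and are not part of the xxHash API.

```c
#include <stddef.h>
#include <stdint.h>

static const uint32_t PRIME1 = 0x9E3779B1u, PRIME2 = 0x85EBCA77u,
                      PRIME3 = 0xC2B2AE3Du, PRIME4 = 0x27D4EB2Fu,
                      PRIME5 = 0x165667B1u;

static uint32_t rotl32(uint32_t x, int r) { return (x << r) | (x >> (32 - r)); }

/* Build a 32-bit word byte by byte, so host endianness does not matter. */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* One accumulator update for a 4-byte lane of a 16-byte stripe. */
static uint32_t lane_round(uint32_t acc, uint32_t lane)
{
    acc += lane * PRIME2;
    return rotl32(acc, 13) * PRIME1;
}

/* Illustrative, unoptimized XXH32; the real library is much faster. */
uint32_t xxh32_sketch(const void *data, size_t len, uint32_t seed)
{
    const unsigned char *p = (const unsigned char *)data;
    size_t i = 0;
    uint32_t h;

    if (len >= 16) {
        /* Four independent accumulators consume 16-byte stripes. */
        uint32_t v1 = seed + PRIME1 + PRIME2, v2 = seed + PRIME2;
        uint32_t v3 = seed, v4 = seed - PRIME1;
        for (; i + 16 <= len; i += 16) {
            v1 = lane_round(v1, read_le32(p + i));
            v2 = lane_round(v2, read_le32(p + i + 4));
            v3 = lane_round(v3, read_le32(p + i + 8));
            v4 = lane_round(v4, read_le32(p + i + 12));
        }
        h = rotl32(v1, 1) + rotl32(v2, 7) + rotl32(v3, 12) + rotl32(v4, 18);
    } else {
        h = seed + PRIME5;              /* short input: no stripes */
    }
    h += (uint32_t)len;

    for (; i + 4 <= len; i += 4) {      /* remaining whole 4-byte words */
        h += read_le32(p + i) * PRIME3;
        h = rotl32(h, 17) * PRIME4;
    }
    for (; i < len; i++) {              /* remaining single bytes */
        h += p[i] * PRIME5;
        h = rotl32(h, 11) * PRIME1;
    }

    /* Final avalanche: spread every input bit across the output. */
    h ^= h >> 15;  h *= PRIME2;
    h ^= h >> 13;  h *= PRIME3;
    h ^= h >> 16;
    return h;
}
```

Because no step depends on how the CPU lays out integers in memory, the function returns the same value on little-endian and big-endian machines. For real use, the library's own header is the better choice, since it adds the speed optimizations this sketch leaves out.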

The README is fairly technical, covering benchmark numbers, build configuration options, and integration instructions. The core use case is simple: any software that needs to hash data quickly and does not need cryptographic security.

Where it fits