The Power of Hashing
Hashing transforms a large key into a small index from 0 to m-1 to store data in a fixed-size table. It aims for O(1) search time.
1. Hash Functions
The core mechanism h(k) that maps a key to an index.
- Division h(k) = k % m. Fast, but sensitive to m (prime is best).
- Multiplication h(k) = ⌊m(kA mod 1)⌋. Good for any m.
- Mid-Square Square k, take middle digits. Good distribution.
- Folding Split key into parts and sum them (e.g., for phone numbers).
2. Collision Resolution
When h(k1) = h(k2), a collision occurs. We must handle it.
- Linear Probing (+1, +2...)
- Quadratic Probing (+12, +22...)
- Double Hashing (use h2(k))
Can handle Load Factor α > 1.
3. Universal Hashing
To prevent attacks where an adversary chooses keys that all hash to the same slot, we can pick a hash function randomly at runtime from a carefully designed family of functions (H).
This makes the probability of collision low for any set of keys.
4. Perfect Hashing
For a static set of keys (that don't change), it's possible to create a hash function that guarantees zero collisions.
This is often done with a two-level hashing scheme, where a universal hash function is used at each level.
Advanced Hashing Techniques
Exploring mathematical transformations to map keys uniformly.
Division
h(k) = k % m. Fast but sensitive to patterns.
BasicMultiplication
h(k) = ⌊m(kA mod 1)⌋. A ≈ 0.618. Good distribution.
RobustMid-Square
Square k, take middle r digits. Mixes all bits.
StatisticalFolding
Split key into parts, sum them up. Good for long numbers.
Strings/BigIntUniversal
Pick random h from a family H at runtime. Prevents attacks.
Security