Custom Hash Table Implementation
To truly understand how hash tables work, it’s useful to build a simple implementation from scratch. This section walks through a minimal hash table that supports insertion, lookup, and deletion using separate chaining.
This implementation prioritizes clarity over performance.
Design Goals
This educational hash table will:
-
Use separate chaining to handle collisions
-
Use a fixed-size bucket array
-
Support basic operations:
- Insert
- Get
- Remove
What it will not do:
- Resize dynamically
- Use a cryptographically secure hash
- Optimize for performance
High-Level Structure
Buckets (Vec)
┌─────┐
│ 0 │ → [(k1, v1)]
│ 1 │ → [(k2, v2), (k3, v3)]
│ 2 │ → []
│ 3 │ → [(k4, v4)]
└─────┘
Each bucket is a small list of key-value pairs.
Hash Function (Simplified)
For educational purposes, we’ll use Rust’s Hash trait combined with a modulo operation:
index = hash(key) % bucket_count
Rust Implementation
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
const DEFAULT_BUCKETS: usize = 16;
#[derive(Debug)]
pub struct HashMap<K, V> {
buckets: Vec<Vec<(K, V)>>,
}
impl<K: Eq + Hash, V> HashMap<K, V> {
pub fn new() -> Self {
HashMap {
buckets: vec![Vec::new(); DEFAULT_BUCKETS],
}
}
fn bucket_index(&self, key: &K) -> usize {
let mut hasher = DefaultHasher::new();
key.hash(&mut hasher);
(hasher.finish() as usize) % self.buckets.len()
}
pub fn insert(&mut self, key: K, value: V) {
let index = self.bucket_index(&key);
let bucket = &mut self.buckets[index];
for &mut (ref existing_key, ref mut existing_value) in bucket.iter_mut() {
if existing_key == &key {
*existing_value = value;
return;
}
}
bucket.push((key, value));
}
pub fn get(&self, key: &K) -> Option<&V> {
let index = self.bucket_index(key);
self.buckets[index]
.iter()
.find(|(k, _)| k == key)
.map(|(_, v)| v)
}
pub fn remove(&mut self, key: &K) -> Option<V> {
let index = self.bucket_index(key);
let bucket = &mut self.buckets[index];
let pos = bucket.iter().position(|(k, _)| k == key)?;
Some(bucket.swap_remove(pos).1)
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn basic_hash_map_works() {
let mut map = HashMap::new();
map.insert("apple", 3);
map.insert("banana", 5);
map.insert("apple", 10);
assert_eq!(map.get(&"apple"), Some(&10));
assert_eq!(map.get(&"banana"), Some(&5));
assert_eq!(map.remove(&"banana"), Some(5));
assert_eq!(map.get(&"banana"), None);
}
}
Example Usage
fn main() {
let mut map = HashMap::new();
map.insert("rust", 1);
map.insert("go", 2);
map.insert("python", 3);
println!("rust → {:?}", map.get(&"rust"));
println!("go → {:?}", map.get(&"go"));
map.remove(&"go");
println!("go → {:?}", map.get(&"go"));
}
Time and Space Complexity
Time Complexity (Average Case)
| Operation | Complexity |
|---|---|
| Insert | O(1) |
| Lookup | O(1) |
| Remove | O(1) |
Worst case degrades to O(n) if all keys collide.
Space Complexity
| Space | Complexity | Reason |
|---|---|---|
| Buckets | O(n) | Store all entries |
Limitations of This Implementation
This hash table is intentionally limited:
- Fixed bucket size
- No resizing
- No load factor control
- No iteration support
These limitations are solved in production-grade hash maps.
Why This Still Matters
Building a hash table yourself teaches:
- Why collisions occur
- How chaining works
- Why load factor matters
- Why resizing is essential
After this, Rust’s HashMap internals make much more sense.
Summary
- Hash tables map keys to buckets using hashing
- Collisions are handled via chaining
- Core operations are straightforward
- Production implementations add resizing and optimizations