Custom Hash Table Implementation

To truly understand how hash tables work, it’s useful to build a simple implementation from scratch. This section walks through a minimal hash table that supports insertion, lookup, and deletion using separate chaining.

This implementation prioritizes clarity over performance.


Design Goals

This educational hash table will:

  • Use separate chaining to handle collisions

  • Use a fixed-size bucket array

  • Support basic operations:

    • Insert
    • Get
    • Remove

What it will not do:

  • Resize dynamically
  • Use a cryptographically secure hash
  • Optimize for performance

High-Level Structure

Buckets (Vec)
┌─────┐
│  0  │ → [(k1, v1)]
│  1  │ → [(k2, v2), (k3, v3)]
│  2  │ → []
│  3  │ → [(k4, v4)]
└─────┘

Each bucket is a small list of key-value pairs.


Hash Function (Simplified)

For educational purposes, we’ll use Rust’s Hash trait combined with a modulo operation:

index = hash(key) % bucket_count

Rust Implementation

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

const DEFAULT_BUCKETS: usize = 16;

#[derive(Debug)]
pub struct HashMap<K, V> {
    buckets: Vec<Vec<(K, V)>>,
}

impl<K: Eq + Hash, V> HashMap<K, V> {
    pub fn new() -> Self {
        HashMap {
            buckets: vec![Vec::new(); DEFAULT_BUCKETS],
        }
    }

    fn bucket_index(&self, key: &K) -> usize {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        (hasher.finish() as usize) % self.buckets.len()
    }

    pub fn insert(&mut self, key: K, value: V) {
        let index = self.bucket_index(&key);
        let bucket = &mut self.buckets[index];

        for &mut (ref existing_key, ref mut existing_value) in bucket.iter_mut() {
            if existing_key == &key {
                *existing_value = value;
                return;
            }
        }

        bucket.push((key, value));
    }

    pub fn get(&self, key: &K) -> Option<&V> {
        let index = self.bucket_index(key);
        self.buckets[index]
            .iter()
            .find(|(k, _)| k == key)
            .map(|(_, v)| v)
    }

    pub fn remove(&mut self, key: &K) -> Option<V> {
        let index = self.bucket_index(key);
        let bucket = &mut self.buckets[index];

        let pos = bucket.iter().position(|(k, _)| k == key)?;
        Some(bucket.swap_remove(pos).1)
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn basic_hash_map_works() {
        let mut map = HashMap::new();

        map.insert("apple", 3);
        map.insert("banana", 5);
        map.insert("apple", 10);

        assert_eq!(map.get(&"apple"), Some(&10));
        assert_eq!(map.get(&"banana"), Some(&5));

        assert_eq!(map.remove(&"banana"), Some(5));
        assert_eq!(map.get(&"banana"), None);
    }
}

Example Usage

fn main() {
    let mut map = HashMap::new();

    map.insert("rust", 1);
    map.insert("go", 2);
    map.insert("python", 3);

    println!("rust → {:?}", map.get(&"rust"));
    println!("go → {:?}", map.get(&"go"));

    map.remove(&"go");
    println!("go → {:?}", map.get(&"go"));
}

Time and Space Complexity

Time Complexity (Average Case)

OperationComplexity
InsertO(1)
LookupO(1)
RemoveO(1)

Worst case degrades to O(n) if all keys collide.


Space Complexity

SpaceComplexityReason
BucketsO(n)Store all entries

Limitations of This Implementation

This hash table is intentionally limited:

  • Fixed bucket size
  • No resizing
  • No load factor control
  • No iteration support

These limitations are solved in production-grade hash maps.


Why This Still Matters

Building a hash table yourself teaches:

  • Why collisions occur
  • How chaining works
  • Why load factor matters
  • Why resizing is essential

After this, Rust’s HashMap internals make much more sense.


Summary

  • Hash tables map keys to buckets using hashing
  • Collisions are handled via chaining
  • Core operations are straightforward
  • Production implementations add resizing and optimizations

Rust - Custom Hash Table Implementation