3-7 Closed Hash Table (Open Addressing)

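A closed hash table (closed hashing), also known as open addressing, is a classic strategy for resolving hash collisions. Its core idea is the opposite of an open hash table (separate chaining): every element is stored directly in the hash table's array, and no linked lists are used. When the hash function maps a key to a slot that is already occupied, that is, when a collision occurs, the table follows a probe sequence through the array to find the next free slot. The name "open addressing" reflects that an element's final position is not fixed by its hash value: it may end up at some other address along the probe sequence.

This article focuses on the most basic scheme, linear probing, and briefly introduces quadratic probing and double hashing.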

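Core Concepts

Hash function: maps a key to an array index, for example hash(key) = key % TABLE_SIZE. The examples in this article assume non-negative integer keys; in C, C++, and Go the % operator can return a negative result for a negative key.
Collision: two different keys hash to the same index, or more generally a key hashes to a slot that another key already occupies.
Probing: the process of searching the array, by a fixed rule, for the next usable slot after a collision.
Load factor: written α = n / m, where n is the number of stored elements and m is the array size. Every element occupies one slot, so α can never exceed 1, and practical implementations keep α < 1 so that probing always reaches an empty slot. A probe loop with no empty slot and no bound on its step count would never terminate.

The following trace shows insertions into a closed hash table with TABLE_SIZE = 7: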

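Insert 10:  hash(10) = 10 % 7 = 3  -> slot [3] is empty, store directly
Insert 22:  hash(22) = 22 % 7 = 1  -> slot [1] is empty, store directly
Insert 31:  hash(31) = 31 % 7 = 3  -> slot [3] is occupied, probe slot [4], empty, store
Insert  4:  hash(4)  = 4 % 7  = 4  -> slot [4] is occupied, probe slot [5], empty, store
Insert 15:  hash(15) = 15 % 7 = 1  -> slot [1] is occupied, probe slot [2], empty, store
Insert 28:  hash(28) = 28 % 7 = 0  -> slot [0] is empty, store directly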

Index:  [0]  [1]  [2]  [3]  [4]  [5]  [6]
Value:   28   22   15   10   31    4

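In this example, 10, 22, and 28 go straight into the slots given by their hash values. 31 collides with 10 (both map to slot 3), and linear probing places it in slot 4. 4 maps to slot 4, which 31 already holds, so it is probed into slot 5. 15 collides with 22 (both map to slot 1) and is probed into slot 2. Slot 6 stays empty.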
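The trace above can be reproduced with a few lines of Python. This is only a minimal sketch: the helper name build_table is made up for illustration, empty slots are represented by None, and it assumes there are fewer keys than slots.

```python
TABLE_SIZE = 7  # same table size as the trace above


def build_table(keys, size=TABLE_SIZE):
    """Insert keys with linear probing; return the slot list (None = empty)."""
    slots = [None] * size
    for key in keys:
        index = key % size
        # Step forward (wrapping around) until a free slot turns up.
        # Assumes fewer keys than slots, so a free slot always exists.
        while slots[index] is not None:
            index = (index + 1) % size
        slots[index] = key
    return slots


table = build_table([10, 22, 31, 4, 15, 28])
print(table)  # [28, 22, 15, 10, 31, 4, None]
```

The printed list matches the Index/Value layout shown above, with slot 6 left empty.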

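Linear Probing

Linear probing is the simplest form of open addressing. When a collision occurs, the table starts at the slot given by the hash value and checks the following slots one by one (wrapping around to the start after the end of the array) until it finds an empty slot.

The probe formula is: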

index = (hash(key) + i) % TABLE_SIZE,    i = 0, 1, 2, ...

Here i is the probe step count: i = 0 checks the original hash position, i = 1 checks the next position, and so on.
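As a small illustration of the formula, the following sketch lists the slots that linear probing would visit for one key. The function name linear_probe_sequence is an assumption introduced here, not part of the implementations later in the article.

```python
def linear_probe_sequence(key, table_size):
    """Return the slots visited for key, in order, for i = 0 .. table_size - 1."""
    home = key % table_size  # hash(key), the slot checked at i = 0
    return [(home + i) % table_size for i in range(table_size)]


print(linear_probe_sequence(31, 7))  # [3, 4, 5, 6, 0, 1, 2]
```

The sequence starts at the home slot, wraps around at the end of the array, and covers every slot exactly once.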

Step-by-Step Walkthrough

With TABLE_SIZE = 7, insert 10, 22, 31, 4, 15, 28 in order:

Insert 10: hash(10) = 10 % 7 = 3. Slot 3 is empty, so 10 is stored there.

[0]    [1]    [2]    [3]    [4]    [5]    [6]
                       10

Insert 22: hash(22) = 22 % 7 = 1. Slot 1 is empty, so 22 is stored there.

[0]    [1]    [2]    [3]    [4]    [5]    [6]
       22            10

Insert 31: hash(31) = 31 % 7 = 3. Slot 3 is already held by 10 (collision). Linear probing checks slot 4, which is empty, so 31 is stored there.

[0]    [1]    [2]    [3]    [4]    [5]    [6]
       22            10     31

Insert 4: hash(4) = 4 % 7 = 4. Slot 4 is already held by 31 (collision). Linear probing checks slot 5, which is empty, so 4 is stored there.

[0]    [1]    [2]    [3]    [4]    [5]    [6]
       22            10     31      4

Insert 15: hash(15) = 15 % 7 = 1. Slot 1 is already held by 22 (collision). Linear probing checks slot 2, which is empty, so 15 is stored there.

[0]    [1]    [2]    [3]    [4]    [5]    [6]
       22     15     10     31      4

Insert 28: hash(28) = 28 % 7 = 0. Slot 0 is empty, so 28 is stored there.

[0]    [1]    [2]    [3]    [4]    [5]    [6]
 28     22     15     10     31      4

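As the walkthrough shows, because 10 and 31 both map to slot 3, 31 is pushed to slot 4; 4, whose home is slot 4, is in turn pushed to slot 5. Occupied slots merge into contiguous runs, and any key that hashes into a run has to walk to its end, which makes the run longer still. This chain reaction, in which one collision triggers further collisions, is called primary clustering, and it is the main drawback of linear probing.

Quadratic Probing and Double Hashing

Linear probing is simple, but it suffers badly from primary clustering. This section briefly introduces two improved probing methods.

Quadratic Probing

Quadratic probing uses a quadratic function of the step count as the probe offset, which avoids the clustering of linear probing: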

index = (hash(key) + c1*i + c2*i^2) % TABLE_SIZE,    i = 0, 1, 2, ...

The most common form uses c1 = 0, c2 = 1:

index = (hash(key) + i^2) % TABLE_SIZE

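The probe order is: h, h+1, h+4, h+9, h+16, ... (each taken modulo TABLE_SIZE)

Quadratic probing greatly reduces primary clustering: because the offset grows with each probe, colliding keys do not pile up in adjacent slots the way they do under linear probing. It can still suffer from secondary clustering: different keys that hash to the same position follow exactly the same probe sequence. In addition, quadratic probing is not guaranteed to visit every slot. With the i^2 form and a prime TABLE_SIZE, only the first (TABLE_SIZE + 1) / 2 probes are guaranteed to hit distinct slots, so an insertion is guaranteed to find a free slot only while the table is at most half full.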
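The limited slot coverage is easy to see on the TABLE_SIZE = 7 example. The sketch below uses the common i^2 form; the function name quadratic_probe_sequence is hypothetical and only serves the illustration.

```python
def quadratic_probe_sequence(key, table_size):
    """Return the slots visited by (hash(key) + i^2) % table_size for i = 0 .. table_size - 1."""
    home = key % table_size
    return [(home + i * i) % table_size for i in range(table_size)]


seq = quadratic_probe_sequence(10, 7)
print(seq)               # [3, 4, 0, 5, 5, 0, 4]
print(sorted(set(seq)))  # [0, 3, 4, 5]
```

For key 10 the first four probes hit distinct slots, after which the sequence only revisits them. Slots 1, 2, and 6 are never reached, which is why a table using this form should stay at most half full.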

Double Hashing

Double hashing uses a second hash function to compute the probe step, so that different keys follow different probe sequences:

index = (hash1(key) + i * hash2(key)) % TABLE_SIZE

For example:

hash1(key) = key % TABLE_SIZE
hash2(key) = 1 + (key % (TABLE_SIZE - 1))    // step function, always >= 1

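The probe order is: h1, h1+h2, h1+2*h2, h1+3*h2, ... (each taken modulo TABLE_SIZE)

The probe sequence of double hashing depends on the key itself, so two keys with the same hash1 will most likely have different hash2 values and therefore follow different probe paths. This greatly reduces clustering, and among the common open addressing strategies double hashing comes closest to ideal uniform hashing. The drawbacks are that two hash functions must be computed, that hash2 must never return 0 (otherwise the probe never moves), and that hash2(key) should be coprime to TABLE_SIZE (automatic when TABLE_SIZE is prime) so that the sequence can reach every slot.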
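To make the difference from linear and quadratic probing concrete, here is a minimal sketch that lists the probe sequences of two keys sharing the same hash1. It uses the example hash1 and hash2 given above; the function name double_hash_sequence is an assumption for illustration.

```python
def double_hash_sequence(key, table_size):
    """Return the slots visited by (hash1 + i * hash2) % table_size."""
    h1 = key % table_size
    h2 = 1 + (key % (table_size - 1))  # step size, always in 1 .. table_size - 1
    return [(h1 + i * h2) % table_size for i in range(table_size)]


print(double_hash_sequence(10, 7))  # [3, 1, 6, 4, 2, 0, 5]
print(double_hash_sequence(31, 7))  # [3, 5, 0, 2, 4, 6, 1]
```

Both keys start at slot 3, but their step sizes differ (5 and 2), so they separate immediately. Because 7 is prime, each sequence visits all seven slots.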

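Data Structure Definition

All elements of a closed hash table live directly in the array. Unlike an open hash table, a closed hash table has to distinguish three slot states:

EMPTY: the slot has never been used. Reaching an EMPTY slot during a search means the key is definitely not in the table.
OCCUPIED: the slot currently stores a key.
DELETED: the slot once stored a key that has since been deleted. A search that reaches a DELETED slot must not stop; it has to keep probing.

The DELETED state exists because resetting a slot straight to EMPTY would make the elements stored after it in the same probe chain (those that collided and were probed further along) unreachable by search. This strategy is called lazy deletion.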
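The effect of the two choices can be checked on the table from the walkthrough. The sketch below is a simplified stand-alone search over two parallel lists (keys and flags), written only for this comparison; it is not the class defined next.

```python
EMPTY, OCCUPIED, DELETED = 0, 1, 2


def search(keys, flags, key):
    """Linear-probing search that stops at EMPTY and steps over DELETED."""
    size = len(keys)
    index = key % size
    for _ in range(size):  # at most one full lap
        if flags[index] == EMPTY:
            return False
        if flags[index] == OCCUPIED and keys[index] == key:
            return True
        index = (index + 1) % size
    return False


# Table from the walkthrough: 22 sits in slot 1, 15 was pushed to slot 2.
keys = [28, 22, 15, 10, 31, 4, 0]
flags = [OCCUPIED] * 6 + [EMPTY]

naive = flags.copy()
naive[1] = EMPTY    # wrong: erase 22 by resetting its slot to EMPTY
lazy = flags.copy()
lazy[1] = DELETED   # lazy deletion: leave a DELETED marker instead

print(search(keys, naive, 15))  # False: the search stops at slot 1
print(search(keys, lazy, 15))   # True: slot 1 is skipped, 15 is found in slot 2
print(search(keys, lazy, 22))   # False: 22 itself is gone
```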

// C++ HashTable class definition
const int TABLE_SIZE = 7;

enum SlotState { EMPTY, OCCUPIED, DELETED };

class HashTable {
private:
    int* table;          // Array storing keys
    SlotState* flags;    // Array storing slot states
    int capacity;        // Number of slots
    int count;           // Number of stored elements

    int hashFunction(int key) {
        return key % capacity;
    }

public:
    HashTable(int size = TABLE_SIZE);
    ~HashTable();
    void insert(int key);
    bool search(int key);
    void remove(int key);
    void display();
};

// C HashTable struct definition
#define TABLE_SIZE 7

typedef enum { EMPTY, OCCUPIED, DELETED } SlotState;

typedef struct {
    int* table;          // Array storing keys
    SlotState* flags;    // Array storing slot states
    int capacity;        // Number of slots
    int count;           // Number of stored elements
} HashTable;

# Python HashTable class definition
TABLE_SIZE = 7

# Slot states
EMPTY = 0
OCCUPIED = 1
DELETED = 2

class HashTable:
    """Hash table using open addressing with linear probing."""
    def __init__(self, size=TABLE_SIZE):
        self.capacity = size
        self.count = 0
        self.table = [0] * size        # Array storing keys
        self.flags = [EMPTY] * size    # Array storing slot states

// Go HashTable struct definition
package main

import "fmt"

const TABLE_SIZE = 7

const (
	EMPTY    = 0
	OCCUPIED = 1
	DELETED  = 2
)

// HashTable represents a closed hash table using open addressing
type HashTable struct {
	table    []int // slice storing keys
	flags    []int // slice storing slot states: EMPTY / OCCUPIED / DELETED
	capacity int   // number of slots
	count    int   // number of stored elements
}

func newHashTable(size int) *HashTable {
	return &HashTable{
		capacity: size,
		count:    0,
		table:    make([]int, size),
		flags:    make([]int, size), // zero-initialized, all EMPTY
	}
}

func (ht *HashTable) hash(key int) int {
	return key % ht.capacity
}

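C++ uses the enum SlotState for the three states, with int* table holding the keys and SlotState* flags holding the state of each slot. C defines the equivalent structure with typedef enum. Python uses the constants EMPTY = 0, OCCUPIED = 1, DELETED = 2 and two lists, one for keys and one for states. Go defines the three state constants with const and stores keys and states in two []int slices; make zero-fills a new slice, and 0 is exactly EMPTY, so the flags need no explicit initialization.

Insertion

Insertion proceeds as follows:

Compute the starting position index = key % capacity with the hash function.
Probe linearly from index: check each slot in turn, moving past any slot that is OCCUPIED by a different key. If an OCCUPIED slot already holds the key, stop without inserting.
On reaching an EMPTY or DELETED slot, store the key there and mark the slot OCCUPIED.
If every slot is occupied (the table is full), the insertion fails.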

// C++ Insert: add a key to the hash table
void HashTable::insert(int key) {
    if (count == capacity) {
        cout << "Hash table is full, cannot insert " << key << endl;
        return;
    }

    int index = hashFunction(key);
    int startIndex = index;

    do {
        if (flags[index] == EMPTY || flags[index] == DELETED) {
            // Found an available slot
            table[index] = key;
            flags[index] = OCCUPIED;
            count++;
            return;
        }
        if (flags[index] == OCCUPIED && table[index] == key) {
            // Key already exists, no duplicate insertion
            return;
        }
        // Slot is occupied by a different key, probe next
        index = (index + 1) % capacity;
    } while (index != startIndex);
}

// C Insert: add a key to the hash table
void insert(HashTable* ht, int key) {
    if (ht->count == ht->capacity) {
        printf("Hash table is full, cannot insert %d\n", key);
        return;
    }

    int index = key % ht->capacity;
    int startIndex = index;

    do {
        if (ht->flags[index] == EMPTY || ht->flags[index] == DELETED) {
            ht->table[index] = key;
            ht->flags[index] = OCCUPIED;
            ht->count++;
            return;
        }
        if (ht->flags[index] == OCCUPIED && ht->table[index] == key) {
            return;  // Key already exists
        }
        index = (index + 1) % ht->capacity;
    } while (index != startIndex);
}

# Python Insert: add a key to the hash table
def insert(self, key):
    if self.count == self.capacity:
        print(f"Hash table is full, cannot insert {key}")
        return

    index = key % self.capacity
    start_index = index

    while True:
        if self.flags[index] == EMPTY or self.flags[index] == DELETED:
            # Found an available slot
            self.table[index] = key
            self.flags[index] = OCCUPIED
            self.count += 1
            return
        if self.flags[index] == OCCUPIED and self.table[index] == key:
            # Key already exists
            return
        # Probe next slot
        index = (index + 1) % self.capacity
        if index == start_index:
            break

// Go Insert: add a key to the hash table
func (ht *HashTable) insert(key int) {
	if ht.count == ht.capacity {
		fmt.Printf("Hash table is full, cannot insert %d\n", key)
		return
	}

	idx := ht.hash(key)
	startIdx := idx

	for {
		if ht.flags[idx] == EMPTY || ht.flags[idx] == DELETED {
			// Found an available slot
			ht.table[idx] = key
			ht.flags[idx] = OCCUPIED
			ht.count++
			return
		}
		if ht.flags[idx] == OCCUPIED && ht.table[idx] == key {
			// Key already exists, no duplicate insertion
			return
		}
		// Slot is occupied by a different key, probe next
		idx = (idx + 1) % ht.capacity
		if idx == startIdx {
			break
		}
	}
}

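Insertion probes linearly from the hash position. It moves on past each slot that is OCCUPIED by a different key and stores the key as soon as it reaches an EMPTY or DELETED slot. C++ and C use a do-while loop, Python uses while True, and Go uses an infinite for {} loop with break; the three forms are logically equivalent, and each probes at most one full lap. If the key is found along the way, the insertion is skipped to avoid a duplicate. One caveat: because the loop stops at the first DELETED slot, it never sees a copy of the key stored further along the probe sequence, so re-inserting a key after an earlier key on its path has been deleted can leave two copies in the table. A stricter variant remembers the first DELETED slot and keeps probing until it reaches EMPTY or finds the key.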
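A duplicate-safe insertion can be sketched as follows. This is one possible variant under stated assumptions, not the article's implementation: the class name ProbingSet, the first_free variable, and the boolean return values are all introduced here for illustration.

```python
EMPTY, OCCUPIED, DELETED = 0, 1, 2


class ProbingSet:
    """Hypothetical minimal table used only to illustrate the insert variant."""

    def __init__(self, size=7):
        self.capacity = size
        self.count = 0
        self.table = [0] * size
        self.flags = [EMPTY] * size

    def insert(self, key):
        """Return True if key was stored, False if already present or table full."""
        index = key % self.capacity
        first_free = -1  # first reusable slot seen on the probe path
        for _ in range(self.capacity):
            state = self.flags[index]
            if state == EMPTY:
                if first_free == -1:
                    first_free = index
                break  # the key cannot be stored beyond an EMPTY slot
            if state == DELETED:
                if first_free == -1:
                    first_free = index  # remember it, but keep looking for key
            elif self.table[index] == key:
                return False  # already present, nothing to do
            index = (index + 1) % self.capacity
        if first_free == -1:
            return False  # no EMPTY or DELETED slot anywhere
        self.table[first_free] = key
        self.flags[first_free] = OCCUPIED
        self.count += 1
        return True

    def remove(self, key):
        """Lazy deletion: mark the slot DELETED; return True if key was found."""
        index = key % self.capacity
        for _ in range(self.capacity):
            if self.flags[index] == EMPTY:
                return False
            if self.flags[index] == OCCUPIED and self.table[index] == key:
                self.flags[index] = DELETED
                self.count -= 1
                return True
            index = (index + 1) % self.capacity
        return False


s = ProbingSet()
s.insert(22)         # stored in slot 1
s.insert(15)         # collides with 22, stored in slot 2
s.remove(22)         # slot 1 becomes DELETED
print(s.insert(15))  # False: 15 is found in slot 2, no second copy is written
print(s.count)       # 1
print(s.insert(8))   # True: 8 % 7 == 1, the DELETED slot 1 is reused
```

The price of the stricter check is that an insertion may probe further than the first DELETED slot before it writes anything.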

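Search

Search proceeds as follows:

Compute the starting position index = key % capacity.
Probe linearly: an OCCUPIED slot with a matching key means the key is found; an EMPTY slot means stop (the key is definitely absent); a DELETED slot is skipped and probing continues.
If a full lap finds nothing, the key is not in the table.

The key difference from insertion: a search must keep probing past DELETED slots, because the target key may have been pushed further along by a collision.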

// C++ Search: find a key, return true if found
bool HashTable::search(int key) {
    int index = hashFunction(key);
    int startIndex = index;

    do {
        if (flags[index] == EMPTY) {
            // Empty slot means key definitely not in table
            return false;
        }
        if (flags[index] == OCCUPIED && table[index] == key) {
            return true;  // Found the key
        }
        // OCCUPIED with different key or DELETED: keep probing
        index = (index + 1) % capacity;
    } while (index != startIndex);

    return false;  // Full loop, key not found
}

// C Search: find a key, return 1 if found, 0 otherwise
int search(HashTable* ht, int key) {
    int index = key % ht->capacity;
    int startIndex = index;

    do {
        if (ht->flags[index] == EMPTY) {
            return 0;
        }
        if (ht->flags[index] == OCCUPIED && ht->table[index] == key) {
            return 1;
        }
        index = (index + 1) % ht->capacity;
    } while (index != startIndex);

    return 0;
}

# Python Search: find a key, return True if found
def search(self, key):
    index = key % self.capacity
    start_index = index

    while True:
        if self.flags[index] == EMPTY:
            # Empty slot means key definitely not in table
            return False
        if self.flags[index] == OCCUPIED and self.table[index] == key:
            return True
        # OCCUPIED with different key or DELETED: keep probing
        index = (index + 1) % self.capacity
        if index == start_index:
            break

    return False

// Go Search: find a key, return true if found
func (ht *HashTable) search(key int) bool {
	idx := ht.hash(key)
	startIdx := idx

	for {
		if ht.flags[idx] == EMPTY {
			// Empty slot means key definitely not in table
			return false
		}
		if ht.flags[idx] == OCCUPIED && ht.table[idx] == key {
			return true // Found the key
		}
		// OCCUPIED with different key or DELETED: keep probing
		idx = (idx + 1) % ht.capacity
		if idx == startIdx {
			break
		}
	}

	return false // Full loop, key not found
}

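A search that reaches an EMPTY slot can conclude at once that the key is absent: had the key been inserted, its probe sequence would have passed through this slot, and the insertion would have used it instead of continuing. A DELETED slot is different and must not stop the search, because the target key may have been inserted while that slot was still occupied and so sits further along the probe sequence.

Deletion

Deletion cannot simply mark the slot EMPTY. Doing so would make the keys after it in the same probe chain (keys that collided and were probed further along) unreachable, because a search would stop early at the EMPTY slot. Deletion therefore has to be lazy: the slot is marked DELETED rather than EMPTY.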

// C++ Remove: delete a key using lazy deletion
void HashTable::remove(int key) {
    int index = hashFunction(key);
    int startIndex = index;

    do {
        if (flags[index] == EMPTY) {
            // Key not found
            cout << "Key " << key << " not found" << endl;
            return;
        }
        if (flags[index] == OCCUPIED && table[index] == key) {
            // Found the key, mark as DELETED
            flags[index] = DELETED;
            count--;
            cout << "Deleted key " << key << endl;
            return;
        }
        index = (index + 1) % capacity;
    } while (index != startIndex);

    cout << "Key " << key << " not found" << endl;
}

// C Remove: delete a key using lazy deletion
void removeKey(HashTable* ht, int key) {
    int index = key % ht->capacity;
    int startIndex = index;

    do {
        if (ht->flags[index] == EMPTY) {
            printf("Key %d not found\n", key);
            return;
        }
        if (ht->flags[index] == OCCUPIED && ht->table[index] == key) {
            ht->flags[index] = DELETED;
            ht->count--;
            printf("Deleted key %d\n", key);
            return;
        }
        index = (index + 1) % ht->capacity;
    } while (index != startIndex);

    printf("Key %d not found\n", key);
}

# Python Remove: delete a key using lazy deletion
def remove(self, key):
    index = key % self.capacity
    start_index = index

    while True:
        if self.flags[index] == EMPTY:
            print(f"Key {key} not found")
            return
        if self.flags[index] == OCCUPIED and self.table[index] == key:
            # Mark as DELETED instead of EMPTY
            self.flags[index] = DELETED
            self.count -= 1
            print(f"Deleted key {key}")
            return
        index = (index + 1) % self.capacity
        if index == start_index:
            break

    print(f"Key {key} not found")

// Go Remove: delete a key using lazy deletion
func (ht *HashTable) remove(key int) {
	idx := ht.hash(key)
	startIdx := idx

	for {
		if ht.flags[idx] == EMPTY {
			// Key not found
			fmt.Printf("Key %d not found\n", key)
			return
		}
		if ht.flags[idx] == OCCUPIED && ht.table[idx] == key {
			// Found the key, mark as DELETED
			ht.flags[idx] = DELETED
			ht.count--
			fmt.Printf("Deleted key %d\n", key)
			return
		}
		idx = (idx + 1) % ht.capacity
		if idx == startIdx {
			break
		}
	}

	fmt.Printf("Key %d not found\n", key)
}

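Deletion probes the same way as search: it follows the probe sequence looking for the target key. Once the key is found, it neither frees memory nor clears the stored value; it only changes the state flag to DELETED. This keeps the probe chain intact and also leaves a slot that insertion can reuse (a DELETED slot can be overwritten with a new key).

Complete Implementation

Below are complete closed hash table implementations using linear probing, each with insert, search, delete, and display operations combined into a standalone runnable program.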

#include <iostream>
using namespace std;

const int TABLE_SIZE = 7;

enum SlotState { EMPTY, OCCUPIED, DELETED };

class HashTable {
private:
    int* table;
    SlotState* flags;
    int capacity;
    int count;

    int hashFunction(int key) {
        return key % capacity;
    }

public:
    HashTable(int size = TABLE_SIZE) : capacity(size), count(0) {
        table = new int[capacity];
        flags = new SlotState[capacity];
        for (int i = 0; i < capacity; i++) {
            flags[i] = EMPTY;
        }
    }

    ~HashTable() {
        delete[] table;
        delete[] flags;
    }

    void insert(int key) {
        if (count == capacity) {
            cout << "Hash table is full, cannot insert " << key << endl;
            return;
        }

        int index = hashFunction(key);
        int startIndex = index;

        do {
            if (flags[index] == EMPTY || flags[index] == DELETED) {
                table[index] = key;
                flags[index] = OCCUPIED;
                count++;
                return;
            }
            if (flags[index] == OCCUPIED && table[index] == key) {
                return;  // Duplicate
            }
            index = (index + 1) % capacity;
        } while (index != startIndex);
    }

    bool search(int key) {
        int index = hashFunction(key);
        int startIndex = index;

        do {
            if (flags[index] == EMPTY) {
                return false;
            }
            if (flags[index] == OCCUPIED && table[index] == key) {
                return true;
            }
            index = (index + 1) % capacity;
        } while (index != startIndex);

        return false;
    }

    void remove(int key) {
        int index = hashFunction(key);
        int startIndex = index;

        do {
            if (flags[index] == EMPTY) {
                cout << "Key " << key << " not found" << endl;
                return;
            }
            if (flags[index] == OCCUPIED && table[index] == key) {
                flags[index] = DELETED;
                count--;
                cout << "Deleted key " << key << endl;
                return;
            }
            index = (index + 1) % capacity;
        } while (index != startIndex);

        cout << "Key " << key << " not found" << endl;
    }

    void display() {
        for (int i = 0; i < capacity; i++) {
            cout << "[" << i << "] ";
            if (flags[i] == EMPTY) {
                cout << "EMPTY";
            } else if (flags[i] == DELETED) {
                cout << "DELETED";
            } else {
                cout << table[i];
            }
            cout << endl;
        }
    }
};

int main() {
    HashTable ht;

    // Insert keys
    cout << "=== Inserting ===" << endl;
    int keys[] = {10, 22, 31, 4, 15, 28};
    for (int k : keys) {
        cout << "Insert " << k << ": hash(" << k << ") = " << k % 7 << endl;
        ht.insert(k);
    }

    // Display
    cout << "\n=== Hash Table ===" << endl;
    ht.display();

    // Search
    cout << "\n=== Search ===" << endl;
    int targets[] = {31, 17, 15};
    for (int t : targets) {
        if (ht.search(t)) {
            cout << "Key " << t << " found" << endl;
        } else {
            cout << "Key " << t << " not found" << endl;
        }
    }

    // Delete
    cout << "\n=== Delete ===" << endl;
    ht.remove(22);
    ht.remove(17);

    // Display after deletion
    cout << "\n=== After Deletion ===" << endl;
    ht.display();

    return 0;
}

#include <stdio.h>
#include <stdlib.h>

#define TABLE_SIZE 7

typedef enum { EMPTY, OCCUPIED, DELETED } SlotState;

typedef struct {
    int* table;
    SlotState* flags;
    int capacity;
    int count;
} HashTable;

HashTable* createHashTable(int size) {
    HashTable* ht = (HashTable*)malloc(sizeof(HashTable));
    if (!ht) { exit(1); }
    ht->capacity = size;
    ht->count = 0;
    ht->table = (int*)malloc(size * sizeof(int));
    ht->flags = (SlotState*)malloc(size * sizeof(SlotState));
    if (!ht->table || !ht->flags) { exit(1); }
    for (int i = 0; i < size; i++) {
        ht->flags[i] = EMPTY;
    }
    return ht;
}

void destroyHashTable(HashTable* ht) {
    free(ht->table);
    free(ht->flags);
    free(ht);
}

void insert(HashTable* ht, int key) {
    if (ht->count == ht->capacity) {
        printf("Hash table is full, cannot insert %d\n", key);
        return;
    }

    int index = key % ht->capacity;
    int startIndex = index;

    do {
        if (ht->flags[index] == EMPTY || ht->flags[index] == DELETED) {
            ht->table[index] = key;
            ht->flags[index] = OCCUPIED;
            ht->count++;
            return;
        }
        if (ht->flags[index] == OCCUPIED && ht->table[index] == key) {
            return;
        }
        index = (index + 1) % ht->capacity;
    } while (index != startIndex);
}

int search(HashTable* ht, int key) {
    int index = key % ht->capacity;
    int startIndex = index;

    do {
        if (ht->flags[index] == EMPTY) {
            return 0;
        }
        if (ht->flags[index] == OCCUPIED && ht->table[index] == key) {
            return 1;
        }
        index = (index + 1) % ht->capacity;
    } while (index != startIndex);

    return 0;
}

void removeKey(HashTable* ht, int key) {
    int index = key % ht->capacity;
    int startIndex = index;

    do {
        if (ht->flags[index] == EMPTY) {
            printf("Key %d not found\n", key);
            return;
        }
        if (ht->flags[index] == OCCUPIED && ht->table[index] == key) {
            ht->flags[index] = DELETED;
            ht->count--;
            printf("Deleted key %d\n", key);
            return;
        }
        index = (index + 1) % ht->capacity;
    } while (index != startIndex);

    printf("Key %d not found\n", key);
}

void display(HashTable* ht) {
    for (int i = 0; i < ht->capacity; i++) {
        printf("[%d] ", i);
        if (ht->flags[i] == EMPTY) {
            printf("EMPTY");
        } else if (ht->flags[i] == DELETED) {
            printf("DELETED");
        } else {
            printf("%d", ht->table[i]);
        }
        printf("\n");
    }
}

int main() {
    HashTable* ht = createHashTable(TABLE_SIZE);

    printf("=== Inserting ===\n");
    int keys[] = {10, 22, 31, 4, 15, 28};
    for (int i = 0; i < 6; i++) {
        printf("Insert %d: hash(%d) = %d\n", keys[i], keys[i], keys[i] % 7);
        insert(ht, keys[i]);
    }

    printf("\n=== Hash Table ===\n");
    display(ht);

    printf("\n=== Search ===\n");
    int targets[] = {31, 17, 15};
    for (int i = 0; i < 3; i++) {
        if (search(ht, targets[i])) {
            printf("Key %d found\n", targets[i]);
        } else {
            printf("Key %d not found\n", targets[i]);
        }
    }

    printf("\n=== Delete ===\n");
    removeKey(ht, 22);
    removeKey(ht, 17);

    printf("\n=== After Deletion ===\n");
    display(ht);

    destroyHashTable(ht);
    return 0;
}

TABLE_SIZE = 7

EMPTY = 0
OCCUPIED = 1
DELETED = 2

class HashTable:
    """Hash table using open addressing with linear probing."""
    def __init__(self, size=TABLE_SIZE):
        self.capacity = size
        self.count = 0
        self.table = [0] * size
        self.flags = [EMPTY] * size

    def _hash(self, key):
        return key % self.capacity

    def insert(self, key):
        if self.count == self.capacity:
            print(f"Hash table is full, cannot insert {key}")
            return

        index = self._hash(key)
        start_index = index

        while True:
            if self.flags[index] == EMPTY or self.flags[index] == DELETED:
                self.table[index] = key
                self.flags[index] = OCCUPIED
                self.count += 1
                return
            if self.flags[index] == OCCUPIED and self.table[index] == key:
                return  # Duplicate
            index = (index + 1) % self.capacity
            if index == start_index:
                break

    def search(self, key):
        index = self._hash(key)
        start_index = index

        while True:
            if self.flags[index] == EMPTY:
                return False
            if self.flags[index] == OCCUPIED and self.table[index] == key:
                return True
            index = (index + 1) % self.capacity
            if index == start_index:
                break

        return False

    def remove(self, key):
        index = self._hash(key)
        start_index = index

        while True:
            if self.flags[index] == EMPTY:
                print(f"Key {key} not found")
                return
            if self.flags[index] == OCCUPIED and self.table[index] == key:
                self.flags[index] = DELETED
                self.count -= 1
                print(f"Deleted key {key}")
                return
            index = (index + 1) % self.capacity
            if index == start_index:
                break

        print(f"Key {key} not found")

    def display(self):
        for i in range(self.capacity):
            state = {EMPTY: "EMPTY", OCCUPIED: str(self.table[i]), DELETED: "DELETED"}
            print(f"[{i}] {state[self.flags[i]]}")

if __name__ == "__main__":
    ht = HashTable()

    print("=== Inserting ===")
    for k in [10, 22, 31, 4, 15, 28]:
        print(f"Insert {k}: hash({k}) = {k % 7}")
        ht.insert(k)

    print("\n=== Hash Table ===")
    ht.display()

    print("\n=== Search ===")
    for t in [31, 17, 15]:
        if ht.search(t):
            print(f"Key {t} found")
        else:
            print(f"Key {t} not found")

    print("\n=== Delete ===")
    ht.remove(22)
    ht.remove(17)

    print("\n=== After Deletion ===")
    ht.display()

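The Go version defines the slot state constants (EMPTY/OCCUPIED/DELETED) with const and stores keys and states in two slices. Go has no enum type like C/C++, so the states are written as explicit constants (iota would work as well). The linear probing loop is a for statement that is logically equivalent to C's do-while.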

package main

import "fmt"

const TABLE_SIZE = 7

const (
	EMPTY    = 0
	OCCUPIED = 1
	DELETED  = 2
)

// HashTable is a closed hash table that resolves collisions by open addressing (linear probing)
type HashTable struct {
	capacity int
	count    int
	table    []int // slice storing keys
	flags    []int // slice storing slot states: EMPTY / OCCUPIED / DELETED
}

func newHashTable(size int) *HashTable {
	return &HashTable{
		capacity: size,
		count:    0,
		table:    make([]int, size),
		flags:    make([]int, size), // zero value is 0, i.e. EMPTY
	}
}

func (ht *HashTable) hash(key int) int {
	return key % ht.capacity
}

// insert adds a key to the hash table, resolving collisions by linear probing
func (ht *HashTable) insert(key int) {
	if ht.count == ht.capacity {
		fmt.Printf("Hash table is full, cannot insert %d\n", key)
		return
	}

	idx := ht.hash(key)
	startIdx := idx

	for {
		if ht.flags[idx] == EMPTY || ht.flags[idx] == DELETED {
			ht.table[idx] = key
			ht.flags[idx] = OCCUPIED
			ht.count++
			return
		}
		if ht.flags[idx] == OCCUPIED && ht.table[idx] == key {
			return // key already exists, no duplicate insertion
		}
		idx = (idx + 1) % ht.capacity
		if idx == startIdx {
			break
		}
	}
}

// search looks up a key and reports whether it was found
func (ht *HashTable) search(key int) bool {
	idx := ht.hash(key)
	startIdx := idx

	for {
		if ht.flags[idx] == EMPTY {
			return false // an empty slot means the key is definitely absent
		}
		if ht.flags[idx] == OCCUPIED && ht.table[idx] == key {
			return true
		}
		idx = (idx + 1) % ht.capacity
		if idx == startIdx {
			break
		}
	}

	return false
}

// remove deletes a key using lazy deletion (marks the slot DELETED)
func (ht *HashTable) remove(key int) {
	idx := ht.hash(key)
	startIdx := idx

	for {
		if ht.flags[idx] == EMPTY {
			fmt.Printf("Key %d not found\n", key)
			return
		}
		if ht.flags[idx] == OCCUPIED && ht.table[idx] == key {
			ht.flags[idx] = DELETED
			ht.count--
			fmt.Printf("Deleted key %d\n", key)
			return
		}
		idx = (idx + 1) % ht.capacity
		if idx == startIdx {
			break
		}
	}

	fmt.Printf("Key %d not found\n", key)
}

// display prints every slot and its state
func (ht *HashTable) display() {
	for i := 0; i < ht.capacity; i++ {
		fmt.Printf("[%d] ", i)
		switch ht.flags[i] {
		case EMPTY:
			fmt.Print("EMPTY")
		case DELETED:
			fmt.Print("DELETED")
		default:
			fmt.Print(ht.table[i])
		}
		fmt.Println()
	}
}

func main() {
	ht := newHashTable(TABLE_SIZE)

	// Insert keys
	fmt.Println("=== Inserting ===")
	keys := []int{10, 22, 31, 4, 15, 28}
	for _, k := range keys {
		fmt.Printf("Insert %d: hash(%d) = %d\n", k, k, k%7)
		ht.insert(k)
	}

	// Display the hash table
	fmt.Println("\n=== Hash Table ===")
	ht.display()

	// Search for keys
	fmt.Println("\n=== Search ===")
	targets := []int{31, 17, 15}
	for _, t := range targets {
		if ht.search(t) {
			fmt.Printf("Key %d found\n", t)
		} else {
			fmt.Printf("Key %d not found\n", t)
		}
	}

	// Delete keys
	fmt.Println("\n=== Delete ===")
	ht.remove(22)
	ht.remove(17)

	// Display after deletion
	fmt.Println("\n=== After Deletion ===")
	ht.display()
}

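Go defines the three slot states with const and keeps keys and states in two []int slices. make zero-fills a new slice, and 0 is exactly EMPTY, so no explicit initialization is needed. Deletion marks the slot DELETED rather than EMPTY, which keeps the probe chain intact; this is the same lazy deletion strategy as in the C/C++ versions.

Running any of the four programs prints: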

=== Inserting ===
Insert 10: hash(10) = 3
Insert 22: hash(22) = 1
Insert 31: hash(31) = 3
Insert 4: hash(4) = 4
Insert 15: hash(15) = 1
Insert 28: hash(28) = 0

=== Hash Table ===
[0] 28
[1] 22
[2] 15
[3] 10
[4] 31
[5] 4
[6] EMPTY

=== Search ===
Key 31 found
Key 17 not found
Key 15 found

=== Delete ===
Deleted key 22
Key 17 not found

=== After Deletion ===
[0] 28
[1] DELETED
[2] 15
[3] 10
[4] 31
[5] 4
[6] EMPTY

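As the output shows:

After inserting {10, 22, 31, 4, 15, 28}, the slot layout is exactly the one from the step-by-step walkthrough. 10 is in slot 3, 22 is in slot 1, 31 collided with 10 and was probed into slot 4, 4 found slot 4 occupied and was probed into slot 5, 15 collided with 22 and was probed into slot 2, and 28 is in slot 0.
Searching for 31 starts at slot 3, steps past 10, and finds the key in slot 4. Searching for 17 starts at slot 3 (17 % 7 = 3), probes all the way to the empty slot 6, and reports the key absent. Searching for 15 starts at slot 1, steps past 22, and finds the key in slot 2.
After 22 is deleted, slot 1 is marked DELETED rather than EMPTY, so a later search for 15 correctly steps over slot 1 and continues to slot 2.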
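The probe paths described above can be traced explicitly. The following sketch rebuilds the final table state by hand (after 22 has been deleted) and records which slots a search touches; the helper name probe_path is hypothetical.

```python
EMPTY, OCCUPIED, DELETED = 0, 1, 2


def probe_path(keys, flags, key):
    """Return (found, slots_visited) for a linear-probing search."""
    size = len(keys)
    index = key % size
    visited = []
    for _ in range(size):  # at most one full lap
        visited.append(index)
        if flags[index] == EMPTY:
            return False, visited
        if flags[index] == OCCUPIED and keys[index] == key:
            return True, visited
        index = (index + 1) % size
    return False, visited


# Table state after deleting 22: slot 1 is DELETED, slot 6 is EMPTY.
keys = [28, 22, 15, 10, 31, 4, 0]
flags = [OCCUPIED, DELETED, OCCUPIED, OCCUPIED, OCCUPIED, OCCUPIED, EMPTY]

print(probe_path(keys, flags, 31))  # (True, [3, 4])
print(probe_path(keys, flags, 17))  # (False, [3, 4, 5, 6])
print(probe_path(keys, flags, 15))  # (True, [1, 2])
```

The unsuccessful search for 17 is the most expensive of the three: it has to walk the whole cluster in slots 3 to 5 before the EMPTY slot 6 stops it.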

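Performance Analysis

The table below compares the three open addressing strategies:

Feature	Linear Probing	Quadratic Probing	Double Hashing
Probe formula	`(h + i) % m`	`(h + i^2) % m`	`(h1 + i*h2) % m`
Clustering	Severe primary clustering	Milder secondary clustering	Almost none
Cache behavior	Best (sequential memory access)	Fairly good	Worse (scattered access)
Implementation complexity	Simplest	Moderate	More involved
Slot coverage	Probe sequence visits every slot	Not every slot is guaranteed; with prime m a free slot is always found while α ≤ 0.5	Visits every slot when h2 is coprime to m
Insert / unsuccessful search (expected probes)	≈ (1 + 1/(1-α)^2) / 2	Close to double hashing	≈ 1 / (1 - α)
Successful search (expected probes)	≈ (1 + 1/(1-α)) / 2	Close to double hashing	≈ (1/α) · ln(1 / (1 - α))

Here α is the load factor, α = n / m. The probe counts are the classical estimates for well-distributed hash values; quadratic probing lands slightly above double hashing because of secondary clustering.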
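To get a feel for how quickly the cost rises with the load factor, the sketch below evaluates the classical expected-probe estimates for linear probing and for ideal uniform hashing (which double hashing approximates). The formulas are the standard textbook approximations and are stated here as assumptions rather than derived; the function names are made up for this example.

```python
import math


def linear_hit(alpha):
    """Expected probes, successful search, linear probing."""
    return 0.5 * (1 + 1 / (1 - alpha))


def linear_miss(alpha):
    """Expected probes, insert or unsuccessful search, linear probing."""
    return 0.5 * (1 + 1 / (1 - alpha) ** 2)


def uniform_hit(alpha):
    """Expected probes, successful search, uniform hashing."""
    return math.log(1 / (1 - alpha)) / alpha


def uniform_miss(alpha):
    """Expected probes, insert or unsuccessful search, uniform hashing."""
    return 1 / (1 - alpha)


print("alpha  lin-hit  lin-miss  uni-hit  uni-miss")
for alpha in (0.5, 0.7, 0.9):
    print(f"{alpha:5.1f}  {linear_hit(alpha):7.2f}  {linear_miss(alpha):8.2f}"
          f"  {uniform_hit(alpha):7.2f}  {uniform_miss(alpha):8.2f}")
```

At α = 0.5 an unsuccessful search under linear probing costs about 2.5 probes; at α = 0.9 the same estimate is about 50 probes, against about 10 for uniform hashing. This gap is the practical cost of primary clustering.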

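Key notes:

Load factor limit: open addressing cannot hold more than m elements, and it needs α < 1 (at least one EMPTY slot) for an unsuccessful probe to stop early; the implementations in this article additionally bound every probe loop to one full lap. In practice the load factor is usually kept at or below about 0.5 to 0.7. As α approaches 1, the probe count rises sharply and performance collapses.
Primary clustering: the biggest problem of linear probing. Once a contiguous run of slots is occupied (a cluster), any new key that hashes anywhere into the run walks the probe sequence to the end of the run and extends it, so clusters keep growing. As a result the probe count grows much faster than the load factor itself.
Effect of lazy deletion: a DELETED slot is not counted in count, and insertion can reuse it, but searches still have to step over it. Only EMPTY slots stop an unsuccessful search, and there may be far fewer of them than capacity - count. With frequent insertions and deletions, DELETED slots accumulate and probing slows down. The remedies are a periodic cleanup (rehash) and reusing DELETED slots on insertion (the implementation in this article already does the latter).
Resizing (rehashing): when the load factor exceeds a threshold, allocate a larger array (typically twice the old size) and re-insert every element into it with the new capacity. Afterwards no DELETED markers remain, and probing is as efficient as it can be for that load.
Space complexity: O(m), where m is the array size. Unlike an open hash table, a closed hash table has no per-element pointer overhead; the data sits in a contiguous array, which is friendlier to the cache.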
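The rehashing step mentioned above can be sketched in a few lines. This is a minimal illustration over two parallel lists, not part of the article's implementations: the function name rehash is hypothetical, the new capacity of 17 is an arbitrary choice (a prime a little over twice the old size), and the code assumes the new capacity exceeds the number of live keys.

```python
EMPTY, OCCUPIED, DELETED = 0, 1, 2


def rehash(table, flags, new_capacity):
    """Return (new_table, new_flags) containing only the live keys."""
    new_table = [0] * new_capacity
    new_flags = [EMPTY] * new_capacity
    for key, state in zip(table, flags):
        if state != OCCUPIED:
            continue  # EMPTY and DELETED slots carry nothing over
        index = key % new_capacity
        while new_flags[index] == OCCUPIED:  # linear probing in the new array
            index = (index + 1) % new_capacity
        new_table[index] = key
        new_flags[index] = OCCUPIED
    return new_table, new_flags


# Table state from the demo after deleting 22.
old_table = [28, 22, 15, 10, 31, 4, 0]
old_flags = [OCCUPIED, DELETED, OCCUPIED, OCCUPIED, OCCUPIED, OCCUPIED, EMPTY]

new_table, new_flags = rehash(old_table, old_flags, 17)
print(new_flags.count(OCCUPIED), new_flags.count(DELETED))  # 5 0
print(new_table[11], new_table[4])  # 28 4
```

After the rebuild every live key sits in its home slot (28 % 17 = 11, 4 % 17 = 4, and so on), and the DELETED marker left by 22 is gone.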

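Comparison with the open hash table (closed addressing, i.e. separate chaining):

Feature	Open hash table (Separate Chaining)	Closed hash table (Open Addressing)
Collision handling	Colliding elements are stored in a linked list	Probe for the next free slot inside the array
Load factor	May exceed 1.0	Must stay below 1.0
Deletion	Remove the list node directly	Requires lazy deletion
Cache behavior	Worse (list nodes are not contiguous)	Better (data lives in a contiguous array)
Memory overhead	An extra pointer per node	No per-element pointer overhead
Worst case	O(n) (all keys collide into one bucket)	O(n) (probing a full lap)
Typical use	Unknown element count, frequent deletions	Fixed-size elements, tight memory, cache-friendly access

In practice, Python's dict and Rust's HashMap use open addressing (Python 3.6+ uses a refined open addressing layout), while C++'s std::unordered_map and Java's HashMap use separate chaining. Which scheme to pick depends on the use case: when keys are small and memory should be compact and cache friendly, open addressing is the better fit; when the element count varies widely or deletions are frequent, separate chaining is more flexible.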

posted @ 2026-04-16 17:51 游翔
