Hash Map (Hash Table)

Hash Function: A function that converts a large key, such as a phone number, to a small integer that is practical to use as an index into the hash table. In simple terms, a hash function maps a big number or string to a small integer that can be used as an index in the hash table.
A good hash function should have the following properties:
1) It is efficiently computable.
2) It distributes the keys uniformly (each table position is equally likely for each key).
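
As a rough illustration of mapping a string to a small index, here is a minimal sketch of a polynomial string hash. The function name `hashString`, the multiplier 31, and the `capacity` parameter are assumptions made for this example and do not come from the referenced articles. How uniformly it spreads keys depends on the actual key set.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: maps a string key to an index in [0, capacity).
// The multiplier 31 is a commonly used constant, assumed here for illustration.
std::size_t hashString(const std::string &key, std::size_t capacity) {
    assert(capacity > 0);
    std::size_t h = 0;
    for (unsigned char c : key) {
        // Reducing at every step keeps h below capacity.
        h = (h * 31 + c) % capacity;
    }
    return h;
}
```

The cost is linear in the key length, which covers the first property. For example, `hashString("ab", 7)` walks through `'a'` and `'b'` and ends at index 4.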

Hash Table: An array that stores pointers to the records corresponding to given keys. An entry in the hash table is NIL if no existing key hashes to that entry's index.

Collision Handling: Since a hash function maps a big key to a small number, two keys may produce the same value. The situation where a newly inserted key maps to an already occupied slot in the hash table is called a collision, and it must be handled with a collision handling technique. The following are ways to handle collisions:

  • Separate Chaining: Each cell of the hash table points to a linked list of records that have the same hash function value. Chaining is simple, but it requires additional memory outside the table.
  • Open Addressing: All elements are stored in the hash table itself. Each table entry contains either a record or NIL. When searching for an element, we examine table slots one by one until the desired element is found or it is clear that the element is not in the table.
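
To make the difference between the two techniques concrete, the sketch below places the same keys both ways. The helper names `chainBuckets` and `probeSlots` and the sentinel `kEmpty` are hypothetical; the code assumes non-negative keys, uses `key % capacity` as the hash function, and does not merge duplicate keys.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Separate chaining view: bucket i collects every key with key % capacity == i.
std::vector<std::vector<int>> chainBuckets(const std::vector<int> &keys, int capacity) {
    assert(capacity > 0);
    std::vector<std::vector<int>> buckets(capacity);
    for (int k : keys) {
        assert(k >= 0); // assumption of this sketch: non-negative keys only
        buckets[k % capacity].push_back(k);
    }
    return buckets;
}

// Open addressing view (linear probing): each key takes the first free slot
// at or after key % capacity. kEmpty marks an unused slot.
const int kEmpty = -1;
std::vector<int> probeSlots(const std::vector<int> &keys, int capacity) {
    assert(capacity > 0 && keys.size() <= static_cast<std::size_t>(capacity));
    std::vector<int> slots(capacity, kEmpty);
    for (int k : keys) {
        assert(k >= 0);
        int i = k % capacity;
        // Terminates because at least one free slot remains for every key.
        while (slots[i] != kEmpty) i = (i + 1) % capacity;
        slots[i] = k;
    }
    return slots;
}
```

With the keys 15, 11, 27, 8 and capacity 7, both 15 and 8 map to index 1. Chaining keeps them together in bucket 1, while linear probing moves 8 on to slot 2.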

 

Separate Chaining 

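The interface follows C++'s unordered_map. The idea of chaining is to use a list to store all elements that share the same hashIndex.

Here vector<list<HashNode>> is used; storing HashNode * would of course work as well.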

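#include <climits>   // INT_MIN
#include <iostream>
#include <list>
#include <vector>
using namespace std;

struct HashNode {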
    int key;
    int value;
    HashNode(int k, int v):key(k),value(v){}
};

class HashMap {
    vector<list<HashNode>> v;
    int capacity;
public:
    HashMap(int cap){
        capacity = cap;
        v.resize(capacity);
    }
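    int hashFunction(int key){
        // key%capacity is negative for negative keys in C++; shift the result into [0, capacity)
        return ((key%capacity)+capacity)%capacity;
    }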
    void insert(int key, int value){
        int hashIndex=hashFunction(key);
        auto &cur_list=v[hashIndex];
        for (auto it=cur_list.begin();it!=cur_list.end();++it){
            if (it->key==key){
                it->value=value;
                return;
            }
        }
        cur_list.emplace_back(key,value);
    }
    void erase(int key){
        int hashIndex=hashFunction(key);
        auto &cur_list=v[hashIndex];
        for (auto it=cur_list.begin();it!=cur_list.end();++it){
            if (it->key==key){
                cur_list.erase(it);
                return;
            }
        }
    }
    int at(int key){
        int hashIndex=hashFunction(key);
        auto &cur_list=v[hashIndex];
        for (auto it=cur_list.begin();it!=cur_list.end();++it){
            if (it->key==key){
                return it->value;
            }
        }
        return INT_MIN; // not found
    }
    void display(){
        for (int i=0;i<capacity;++i){
            cout << i << ": ";
            for (auto x:v[i]) cout<<x.key<<' ';
            cout << endl;
        }
    }
};

int main() {
    HashMap hashmap(7);
    vector<int> to_insert{15,11,27,8};
    for (int x:to_insert) hashmap.insert(x,x);
    hashmap.display();
    cout<<hashmap.at(8)<<endl;
    hashmap.erase(8);
    hashmap.display();
    cout<<hashmap.at(8)<<endl; // INT_MIN: 8 is no longer in the map
    return 0;
}

 

Open Addressing (Linear Probing)

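insert only needs to find a free slot, probing from hashIndex onward. A slot is free in two cases: it is NULL, or it holds a node that was deleted. Note also that if we find a node whose key equals the key being inserted, we simply update that node. When size==capacity at insert time, a rehash is needed (not covered here).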

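erase and at work the same way: search from hashIndex until a node with the same key is found. After erase, the slot is pointed at the {-1,-1} node, because if the node were deleted and the slot set to NULL, later erase and at calls would lose information.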

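For example, suppose 1, 2, and 3 have the same hashIndex and are stored one after another in the array. If 2 is deleted and its slot is set to NULL, 3 can no longer be erased or found.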
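
The example above can be checked with a small sketch. It uses the concrete keys 1, 8, and 15 in a 7-slot table, since all three give index 1 under `key % 7`. The `State` enum, the `Slot` struct, and both function names are hypothetical and separate from the HashMap class that follows; non-negative keys are assumed.

```cpp
#include <cassert>
#include <vector>

enum class State { Empty, Deleted, Occupied };
struct Slot { State state; int key; };

// Linear-probing lookup: returns the slot index of key, or -1 if it is absent.
// An Empty slot ends the search; a Deleted slot (tombstone) does not.
int findSlot(const std::vector<Slot> &table, int key) {
    assert(key >= 0); // assumption of this sketch
    int n = static_cast<int>(table.size());
    if (n == 0) return -1;
    int i = key % n;
    for (int probes = 0; probes < n; ++probes) {
        if (table[i].state == State::Empty) return -1;
        if (table[i].state == State::Occupied && table[i].key == key) return i;
        i = (i + 1) % n;
    }
    return -1;
}

// Builds a 7-slot table holding 1, 8, 15 in slots 1, 2, 3,
// then removes 8 by marking its slot with the given state.
std::vector<Slot> tableAfterRemoving8(State marker) {
    std::vector<Slot> table(7, Slot{State::Empty, 0});
    table[1] = Slot{State::Occupied, 1};
    table[2] = Slot{State::Occupied, 8};
    table[3] = Slot{State::Occupied, 15};
    table[2].state = marker;
    return table;
}
```

Marking the removed slot as `State::Deleted` still lets `findSlot` reach 15 in slot 3. Marking it as `State::Empty` stops the probe at slot 2, so 15 is reported as missing even though it is still stored.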

#include <climits>   // INT_MIN
#include <iostream>
#include <vector>
using namespace std;

struct HashNode {
    int key;
    int value;
    HashNode(int k, int v):key(k),value(v){}
};

class HashMap {
    vector<HashNode *> arr;
    int capacity;
    int size;
    HashNode *deletedNode;
public:
    HashMap(int cap){
        capacity = cap;
        size = 0;
        arr.resize(capacity,NULL);
        deletedNode = new HashNode(-1,-1);
    }
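    int hashFunction(int key){
        // key%capacity is negative for negative keys in C++; shift the result into [0, capacity)
        return ((key%capacity)+capacity)%capacity;
    }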
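    void insert(int key, int value){
        int hashIndex=hashFunction(key);
        int firstFree=-1; // first reusable slot (NULL or deleted) seen while probing
        // Probe at most capacity slots so that a full table cannot loop forever.
        for (int i=0;i<capacity;++i){
            HashNode *cur=arr[hashIndex];
            if (cur==NULL){
                if (firstFree==-1) firstFree=hashIndex;
                break; // a NULL slot ends the probe sequence: the key is not present
            }
            if (cur==deletedNode){
                if (firstFree==-1) firstFree=hashIndex;
            } else if (cur->key==key){
                cur->value=value; // key already present: update in place
                return;
            }
            hashIndex = (hashIndex+1)%capacity;
        }
        if (firstFree==-1) return; // table is full; a rehash would be needed here
        arr[firstFree] = new HashNode(key,value);
        ++size;
    }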
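    void erase(int key){
        int hashIndex=hashFunction(key);
        // Probe until a NULL slot, visiting each slot at most once.
        for (int i=0;i<capacity && arr[hashIndex]!=NULL;++i){
            if (arr[hashIndex]!=deletedNode && arr[hashIndex]->key==key){
                delete arr[hashIndex]; // free the node before leaving the deleted marker
                arr[hashIndex] = deletedNode;
                --size;
                return;
            }
            hashIndex = (hashIndex+1)%capacity;
        }
    }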
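    int at(int key){
        int hashIndex=hashFunction(key);
        // Probe until a NULL slot, visiting each slot at most once and skipping deleted markers.
        for (int i=0;i<capacity && arr[hashIndex]!=NULL;++i){
            if (arr[hashIndex]!=deletedNode && arr[hashIndex]->key==key){
                return arr[hashIndex]->value;
            }
            hashIndex = (hashIndex+1)%capacity;
        }
        return INT_MIN; // not found
    }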
    void display(){
        for (auto x:arr){
            if (x==NULL) cout<<"null"<<' ';
            else cout<<x->key<<' ';
        }
        cout << endl;
    }
};

int main() {
    HashMap hashmap(7);
    vector<int> to_insert{76,93,40,47,10,55};
    for (int x:to_insert) hashmap.insert(x,x);
    hashmap.display();
    cout<<hashmap.at(55)<<endl;
    hashmap.erase(55);
    hashmap.display();
    cout<<hashmap.at(55)<<endl; // INT_MIN: 55 is no longer in the map
    return 0;
}
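
The notes on insert mention that a rehash is needed once size reaches capacity but leave it out. One possible shape for it is sketched below as a standalone function rather than as a member of the class above. The `Entry` layout and the name `rehash` are assumptions for illustration; the sketch assumes non-negative keys and a new capacity large enough to hold every live entry.

```cpp
#include <cassert>
#include <vector>

// Hypothetical slot layout: used == false means the slot is empty.
struct Entry { bool used; int key; int value; };

// Returns a table of newCapacity slots containing every live entry of oldTable,
// re-inserted with linear probing under the new capacity.
std::vector<Entry> rehash(const std::vector<Entry> &oldTable, int newCapacity) {
    assert(newCapacity > 0);
    int live = 0;
    for (const Entry &e : oldTable) if (e.used) ++live;
    assert(live <= newCapacity); // otherwise the probe loop below could not terminate

    std::vector<Entry> table(newCapacity, Entry{false, 0, 0});
    for (const Entry &e : oldTable) {
        if (!e.used) continue; // skip empty slots; deleted markers would be skipped the same way
        int i = e.key % newCapacity;
        while (table[i].used) i = (i + 1) % newCapacity;
        table[i] = e;
    }
    return table;
}
```

Entries must be re-inserted rather than copied slot by slot, because each key's index depends on the capacity. Dropping deleted markers during the rebuild also shortens later probe sequences.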

 

Reference

https://www.geeksforgeeks.org/hashing-set-1-introduction/

https://www.geeksforgeeks.org/hashing-set-2-separate-chaining/

https://www.geeksforgeeks.org/c-program-hashing-chaining/

https://www.geeksforgeeks.org/hashing-set-3-open-addressing/

https://www.geeksforgeeks.org/implementing-hash-table-open-addressing-linear-probing-cpp/

posted @ 2019-10-15 10:42  約束の空