Home » Data Structure Tutorial

# Hashing | Open addressing for collision handling

**Open addressing for collision handling**: In this article are we are going to learn about the open addressing for collision handling which can be further divided into linear probing, quadratic probing, and double hashing.

Submitted by Radib Kar, on July 01, 2020

**Prerequisite:** Hashing data structure

## Open addressing

In open addressing, all the keys will be stored in the hash table itself, not by using any additional memory or extending the index(linked list). This is also known as **closed hashing** and this is done mainly based on probing. Probing can be done based on **either linear probing or quadratic probing**.

In open addressing, we keep rehashing until we resolve.

## Linear Probing

In linear probing, the rehashing process in linear. Say the location found at any step is *n* and *n* is occupied then the next attempt will be to hash at position *(n+1)*. We wrap around from the last table location to first location if necessary.

rehashing(key) = (n+1) % table size

Example:

say the keys are: {123, 124, 333, 4679,983} hash table size is 10

So,

**Step 1:** Inserting 123 in the hash map. So, location is 3 and we can map there since not occupied.

**Step 1:** Inserting 123 in the hash map. So, location is 3 and we can map there since not occupied.

Index Keys 0 Empty 1 Empty 2 Empty 3 123 4 Empty 5 Empty 6 Empty 7 Empty 8 Empty 9 Empty

**Step 2:** Inserting 124 in the hash map. So, location is 4 and we can map since not occupied.

Index Keys 0 Empty 1 Empty 2 Empty 3 123 4 124 5 Empty 6 Empty 7 Empty 8 Empty 9 Empty

**Step 3:**

Inserting 333 in the hash map. So, location is 3, but it's occupied

Next location is 4, that's occupied too.

Next location is 5 and we can map here since unoccupied. (**linear probing**)

Index Keys 0 Empty 1 Empty 2 Empty 3 123 4 124 5 333 6 Empty 7 Empty 8 Empty 9 Empty

**Step 4:** Inserting 4679 in the hash map. So, location is 9. We can map here since unoccupied.

Index Keys 0 Empty 1 Empty 2 Empty 3 123 4 124 5 333 6 Empty 7 Empty 8 Empty 9 4679

**Step 5:**

Inserting 983 in the hash map. So, location is 3.

But it's occupied and using linear probing we map it to 6.

Index Keys 0 Empty 1 Empty 2 Empty 3 123 4 124 5 333 6 983 7 Empty 8 Empty 9 4679

**Advantage and disadvantages of linear probing:**

- The advantage of linear probing is that it's easy to implement.
- The disadvantage is that it leads to
**clustering**. Due to linear probing, the collided keys form a cluster in groups and increases the**probing length**which reduces the overall efficiency.

**C++ implementation for linear probing**

//linear probing #include <bits/stdc++.h> using namespace std; void add_using_linear_probing(int hash[], int a) { //hash function h(x)=x%10 int k = a % 10; //linear probing while (true) { if (hash[k] == -1) { hash[k] = a; break; } k = (k + 1) % 10; //linear increment of probe } } int main() { //set of input numbers vector<int> arr{ 123, 124, 333, 4679, 983 }; //initialize the hash table //each entry of the hash table is a single entry int hash[10]; //size of hashtable is 10 memset(hash, -1, sizeof(hash)); //initialize with empty initially for (int a : arr) { //hashing add_using_linear_probing(hash, a); } cout << "---------using linear probing---------\n"; cout << "Hash table is:\n"; for (int i = 0; i < 10; i++) { if (hash[i] == -1) cout << i << "->" << "Empty" << endl; else cout << i << "->" << hash[i] << endl; } return 0; }

**Output:**

---------using linear probing--------- Hash table is: 0->Empty 1->Empty 2->Empty 3->123 4->124 5->333 6->983 7->Empty 8->Empty 9->4679

## Quadratic probing

The problem of clustering can be avoided by using quadratic probing. Here the rehashing is done like below,

rehashing(key) = (n+k2) % table size where, k is 1,2,3, ...

We wrap around from the last table location to the first location if necessary.

Like say hashing location initially is 3 and 3 is occupied then we will go for 3+1^{2}=4, if it’s still occupied we will go for 4+2^{2}=8. So on

For the above set of keys that we used for linear probing we will use quadratic probing.

**Step 1:** Inserting 123 in the hash map. So, location is 3 and we can map there since not occupied.

Index Keys 0 Empty 1 Empty 2 Empty 3 123 4 Empty 5 Empty 6 Empty 7 Empty 8 Empty 9 Empty

**Step 2:** Inserting 124 in the hash map. So, location is 4 and we can map since not occupied.

Index Keys 0 Empty 1 Empty 2 Empty 3 123 4 124 5 Empty 6 Empty 7 Empty 8 Empty 9 Empty

**Step 3:**

Inserting 333 in the hash map. So, location is 3, but it’s occupied.

Next location is 3+1^{2}=4, that’s occupied too.

Next location is 4+2^{2}=8 and we can map here since unoccupied. (**quadratic probing**)

Index Keys 0 Empty 1 Empty 2 Empty 3 123 4 124 5 Empty 6 Empty 7 Empty 8 333 9 Empty

**Step 4:**

Inserting 4679 in the hash map. So, location is 9.

We can map here since unoccupied.

Index Keys 0 Empty 1 Empty 2 Empty 3 123 4 124 5 Empty 6 Empty 7 Empty 8 333 9 4679

**Step 5:**

Inserting 983 in the hash map. So, location is 3.

So, location is 3, but it's occupied

Next location is 3+1^{2}=4, that's occupied too.

Next location is 4+2^{2}=8 and that's occupied too.

Next location is 8+3^{2}==17%10=7

we can map here since unoccupied. (**quadratic probing**)

Index Keys 0 Empty 1 Empty 2 Empty 3 123 4 124 5 Empty 6 Empty 7 983 8 333 9 4679

**Advantage and disadvantages of quadratic probing:**

The main advantage is here we can overcome the problem of clustering which appeared in the case of linear probing. By quadratic chances of clustering is much less.

**C++ implementation of quadratic probing**

//quadratic probing #include <bits/stdc++.h> using namespace std; void add_using_quadratic_probing(int hash[], int a) { //hash function h(x)=x%10 int k = a % 10; //quadratic probing int incr = 1; while (true) { if (hash[k] == -1) { hash[k] = a; break; } //quadratic increment of probe k = (k + int(pow(incr, 2))) % 10; incr++; } } int main() { //set of input numbers vector<int> arr{ 123, 124, 333, 4679, 983 }; //initialize the hash table //each entry of the hash table is a single entry int hash[10]; //size of hashtable is 10 memset(hash, -1, sizeof(hash)); //initialize with empty initially for (int a : arr) { //hashing add_using_quadratic_probing(hash, a); } cout << "---------using quadratic probing---------\n"; cout << "Hash table is:\n"; for (int i = 0; i < 10; i++) { if (hash[i] == -1) cout << i << "->" << "Empty" << endl; else cout << i << "->" << hash[i] << endl; } return 0; }

**Output:**

---------using quadratic probing--------- Hash table is: 0->Empty 1->Empty 2->Empty 3->123 4->124 5->Empty 6->Empty 7->983 8->333 9->4679

## Double hashing

Double hashing is the best open addressing technique to overcome clustering chances. Here we increment the probing length based on another hash function.

Say the primary hash function is *h1* and secondary hash function is *h2* to increment probing length

Then *f(key)=h1(key)+k*h2(key) where h2≠h1*

Like, first we find h1(key). If it's occupied, we will go for *h1(key)+h2(key)* where *h2(key)* is the probing length. If it's still occupied then will go for *h1(key) +2*h2(key)*, so on

We will wrap up from last of table table to beginning of table if necessary

If we do double hashing with our previous example, And *take h1(key)=key%10* and *h2(key)=number digits in key*

**Step 1:** Inserting 123 in the hash map. So, location is 3 and we can map there since not occupied.

Index Keys 0 Empty 1 Empty 2 Empty 3 123 4 Empty 5 Empty 6 Empty 7 Empty 8 Empty 9 Empty

**Step 2:** Inserting 124 in the hash map. So, location is 4 and we can map since not occupied.

Index Keys 0 Empty 1 Empty 2 Empty 3 123 4 124 5 Empty 6 Empty 7 Empty 8 Empty 9 Empty

**Step 3:**

Inserting 333 in the hash map. So, location is 3, but it's occupied.

Next location is 3+digits(3) = 6, that's occupied too.

Next location is 4+2^{2}=8 and we can map here since unoccupied. (**double hashing**)

Index Keys 0 Empty 1 Empty 2 Empty 3 123 4 124 5 Empty 6 333 7 Empty 8 Empty 9 Empty

**Step 4:**

Inserting 4679 in the hash map. So, location is 9.

We can map here since unoccupied.

Index Keys 0 Empty 1 Empty 2 Empty 3 123 4 124 5 Empty 6 333 7 Empty 8 Empty 9 4679

**Step 5:**

Inserting 983 in the hash map. So, location is 3, but it's occupied.

Next location is 3+digit(983)=6, that's occupied too.

Next location is 3+2*digit(983)=9 and that's occupied too.

Next location is 3+3*digit(983)==12%10=2

We can map here since unoccupied. (**double hashing**)

Index Keys 0 Empty 1 Empty 2 983 3 123 4 124 5 Empty 6 333 7 Empty 8 Empty 9 4679

**Advantages and disadvantages of double hashing:**

The clustering chance is very less as we are using a separate hashing function, h2(key) to increment the probing length. A new hashing function is an overhead of course.

Double hashing uses few probes than quadratic or linear probing but takes more time than those two.

**C++ implementation of Double hashing**

//double hashing #include <bits/stdc++.h> using namespace std; int digit(int a) { return to_string(a).length(); } void add_using_double_hashing(int hash[], int a) { //hash function h1(x)=x%10 //hash function h2(x)=digit(x) //for incrementing probing int k = a % 10; //double hashing int count = 1; while (true) { if (hash[k] == -1) { hash[k] = a; break; } //double hashing for incrementing prob length k = (k + count * digit(a)) % 10; count++; } } int main() { //set of input numbers vector<int> arr{ 123, 124, 333, 4679, 983 }; //initialize the hash table //each entry of the hash table is a single entry int hash[10]; //size of hashtable is 10 //initialize with empty initially memset(hash, -1, sizeof(hash)); for (int a : arr) { //hashing add_using_double_hashing(hash, a); } cout << "---------using double hashing---------\n"; cout << "Hash table is:\n"; for (int i = 0; i < 10; i++) { if (hash[i] == -1) cout << i << "->" << "Empty" << endl; else cout << i << "->" << hash[i] << endl; } return 0; }

**Output:**

---------using double hashing--------- Hash table is: 0->Empty 1->Empty 2->983 3->123 4->124 5->Empty 6->333 7->Empty 8->Empty 9->4679

TOP Interview Coding Problems/Challenges

- Run-length encoding (find/print frequency of letters in a string)
- Sort an array of 0's, 1's and 2's in linear time complexity
- Checking Anagrams (check whether two string is anagrams or not)
- Relative sorting algorithm
- Finding subarray with given sum
- Find the level in a binary tree with given sum K
- Check whether a Binary Tree is BST (Binary Search Tree) or not
- 1[0]1 Pattern Count
- Capitalize first and last letter of each word in a line
- Print vertical sum of a binary tree
- Print Boundary Sum of a Binary Tree
- Reverse a single linked list
- Greedy Strategy to solve major algorithm problems
- Job sequencing problem
- Root to leaf Path Sum
- Exit Point in a Matrix
- Find length of loop in a linked list
- Toppers of Class
- Print All Nodes that don't have Sibling
- Transform to Sum Tree
- Shortest Source to Destination Path

Comments and Discussions