Home » Data Structure » Hashing Coding Problems

# Sort elements by frequency

In this article, we are going to see **how to sort an array based on frequency?** This problem is a classic problem in hashing and a variety of interview problems is based on this core concept. In this article also, we are going to see the instance of sorting a map based on its value instead of a key.

Submitted by Radib Kar, on July 05, 2020

**Prerequisite:**

- Hashing data structure
- How to write user-defined comparator for priority queue STL in C++?
- How to sort a map based on values instead of value?

**Problem statement:**

Sort an array based on frequencies. The element having maximum frequency will come first. If two elements have the same frequency then the element coming first will appear first in the sorted array two.

**Example:**

Input array: [1, 2, 3, 2, 3, 2, 1] After sorting frequency wise: [2, 2, 2, 1, 1, 3, 3] 2 has the maximum frequency thus, it came first and since 1 and 3 have the same frequency, so, 1 came first preserving the order in the input array.

**Solution:**

The idea is to create a hash table first which will store the frequency of unique elements. That can be created like below:

Hash function, *h(x)=x* but instead of storing key we will store the count

Initially, hash table is empty.

For each key in input array, Hash[key]++ End for

If we create the hash table from the above input array, it will be,

Key Frequency 1 2 2 3 3 2

So now we need to sort the map based on the value instead of the key, but for the same frequencies, we need to preserve the key order(it’s order not key itself).

To preserve the key order we require another hash table that would store the first position of the key. We will store that like below:

Say this hash table is named *first_occurrence*

For each key in input array, If first_occurence[key]==0: //not assigned first_occurence[key]=key index+1 End for

So after processing the input the hash table *first_occurrence* would be:

Key First occurrence 1 1(0+1) 2 2(1+1 3 3(2+1)

So now rest of the task is to sort the hash table based on the map and based on *first_occurrence* hash table.

We will use the priority queue to sort this map based on our defined comparator where we will implement logic to sort by frequency value and *first_occurrence* hash table.

**How to pass this in the comparator is our main challenge or area of our discussion?**

You can observe a thing that we have designed a class for this problem and kept the *first_occurrence* hash table as a data member, not any local member to a function. Thing is not the thing I do in general. So, there must be something, that made me design a class and bring the OOP concept within it. This is because I need to feed the *first_occurrence* hash table into the comparator and the comparator is a lambda function. Like below,

auto comp=[](pair<int,int> a, pair<int,int> b){ //function body } But this time if you notice the implementation, we have auto comp=[this](pair<int,int> a, pair<int,int> b){ //function body }

This helps you to pass the data members to the lambda function or it's known as **capturing of the lambda function**.

That's why we designed the class and made *first_occurrence* as data members to feed into the lambda function. Now we can perform our logic similarly as we do in comparators.

The lambda comparator would be like below which will give us a sorted map based on our requirement via the *priority_queue*.

auto comp=[this](pair<int,int> a, pair<int,int> b){ if(a.second<b.second) return true; else if(a.second>b.second) return false; else{ if(first_occurence[a.first]<first_occurence[b.first]) return false; else return true; } };

So after pushing all pairs(a map element is pair, <key, element>) we will have our priority queue as,

Key Frequency 2 3 1 2 3 2

Now, pop each element from the priority queue and append the key as many times as the frequency,

So at iteration 1 We will pop <2,3> So vector will be 2 2 2 So at iteration 2 We will pop <1,2> So vector will be 2 2 2 1 1 So at iteration 3 We will pop <3,2> So vector will be 2 2 2 1 1 3 3 Queue Is empty So our sorted array based on frequency is 2 2 2 1 1 3 3

**C++ implementation:**

#include <bits/stdc++.h> using namespace std; class Includehelp { public: unordered_map<int, int> first_occurence; void print(vector<int> arr) { for (auto i : arr) { cout << i << " "; } cout << endl; } vector<int> sort_by_frequency(vector<int> arr) { //create the hashmap unordered_map<int, int> mymap; for (int i = 0; i < arr.size(); i++) { mymap[arr[i]]++; if (first_occurence[arr[i]] == 0) first_occurence[arr[i]] = i + 1; } //now to sort based on frequency lets use priority queue //comparator of priority queue using lambda function auto comp = [this](pair<int, int> a, pair<int, int> b) { if (a.second < b.second) return true; else if (a.second > b.second) return false; else { if (first_occurence[a.first] < first_occurence[b.first]) return false; else return true; } }; priority_queue<pair<int, int>, vector<pair<int, int> >, decltype(comp)> pq(comp); for (auto& ij : mymap) { pq.push(ij); } //sorted outcome vector<int> result; while (!pq.empty()) { pair<int, int> p = pq.top(); pq.pop(); int no = p.first; int freq = p.second; for (int i = 0; i < freq; i++) result.push_back(no); } return result; } }; int main() { int n; cout << "Enter number of elements\n"; cin >> n; vector<int> arr(n, 0); cout << "Enter the elements\n"; for (int i = 0; i < n; i++) { cin >> arr[i]; } Includehelp* obj = new Includehelp(); arr = obj->sort_by_frequency(arr); cout << "Printing after sorting by frequency\n"; obj->print(arr); return 0; }

**Output:**

Enter number of elements 8 Enter the elements 1 2 3 1 2 3 4 5 Printing after sorting by frequency 1 1 2 2 3 3 4 5

Related Programs

- Minimum deletions to make all elements same using hash table
- Maximum distance between two occurrences of same element in array
- Check if two arrays are similar or not using hashing
- Minimum number of operations to make all elements same using hashing
- First element occurring K times in the array using hashing
- Count maximum points on same line
- Given an array A[] and number X, check for pair in A[] with sum X | using hashing O(n) time complexity | Set 1
- Given an array A[] and number X, check for pair in A[] with sum X | using two pointer algorithm, O(1) space complexity | Set 2
- Minimum Number of Steps to Make Two Strings Anagram

Comments and Discussions!