# Binary Search in String

**Binary Search in String**: In this article, we have discussed how we can use binary search to find a word from a dictionary (Sorted list of words)?

Submitted by Radib Kar, on July 23, 2020

**Prerequisite:**

**Binary search** is one of the most popular algorithms which searches a key in a sorted range in logarithmic time complexity.

## Binary search in string

Earlier in this article, we saw that we can use binary search to find a key in a sorted range. We discussed based on integer element. In this article, we are going to discuss based on a **sorted list of strings.**

The idea is exactly similar, the similar the only difference is the data type which is string now to represent the words and the sorted list of words is known as a **dictionary**.

For example,

The sorted list of words be: ["bad", "blog", "coder", "coding", "includehelp", "india"] And, the key to be searched is: "dog" then answer would be "Not Found" If the key to be searched is "coding" then index at where the word found is : 3 (0-based)

We know that binary search works on a method that finds a pivot and then compares the pivot element with the key and based on the comparison result it continues to search either in the left half or in the right half only and ultimately terminates.

**So the thing is how string comparison works?**

Say we have two string a and string b. How to decide which string is greater and which one is lesser. To compare strings we use lexicographical ordering. Between two strings a & b, which one has less ASCII valued character at first is counted to be the smaller one. For example, the smaller string between **"d"** & **"abc"** is **"abc"**. Another situation can arrive when one string is prefix string of the other, then, of course, the prefix string would be the smaller one. For example, the smaller one between **"abc"** & **"abcd"** would be **"abc"**.

Below is the detailed algorithm to search a word in a sorted list of words using a binary search.

If the input list is not sorted we need to sort ourselves, otherwise, the binary search will fail.

Let's work on the above example to describe the binary search:

Wordlist: ["bad", "blog", "coder", "coding", "includehelp", "india"] key to be searched= "coding"

The basic steps behind the binary search are to first divide the range into **two (that's why binary)** half based on a pivot. **How will we choose the pivot?**

**We will pick the mid element as our pivot**

To find the *mid* element simple do *mid=(left+right)/2* where *left* is the start index of the current range and *right* is end index of the current range.

Now we need to check whether the search key is the same as the pivot element or not. If it's the same then, we are done. We found the key. If it's not the same then there can be two cases,

//check how comparison works for string**key> pivot element:**

In this case, we need to check only the right half of the range. Right half means the elements which are greater than the pivot. This is possible only because the list is sorted. Since the word list is sorted it's guaranteed that search key will not appear in the left half as it's greater than the pivot. So we can discard the left half and shrink our range to [pivot index+1, right] for further search.//check how comparison works for string**key< pivot element:**

In this case, we need to check only the left half of the range. Left half means the word elements which are less than the pivot. This is possible only because the word list is sorted. Since the word list is sorted it's guaranteed that search key will not appear in the right half as it's less than the pivot. So we can discard the right half and shrink our range to [left, pivot index-1] for further search.

So every time,

- We will find the pivot
*index=(left+ right)/2* - We will check whether the pivot element is key or not, if it's the key then terminate as the key is found. Otherwise, shrink the range and update left or right as per choices discussed above
- Continue until range collapse down to
*0(left>right)*

Below is the dry run with the algorithm:

Iteration 1:Initially, the range is["bad", "blog", "coder", "coding", "includehelp", "india"], key="coding" So left=0, right= 5 Pivot index is (0+5)/2=2, so pivot is "coder" Now pivot < "coding"(key) So we need to search the right half only, hence left=pivot index+1=3 Thus the searching range now: ["coding", "includehelp", "india"] ----------------------------------------Iteration 2:Now, the range is ["coding", "includehelp", "india"], key=" coding" So left=3, right= 5 Pivot index is (3+5)/2=4, so pivot is "includehelp" Now pivot >"coding" (key) So we need to search the left half only, hence right=pivot index-1=3 Thus the searching range now: ["coding", "includehelp", "india"] -----------------------------------------Iteration 3:Now, the range is ["coding"], key="coding" So left=3, right= 3 Pivot index is (3+3)/2=3, so pivot is "coding" Now pivot =="coding" (key) Terminate the search and return pivot index

**Time complexity:**

Time complexity is the same as binary search which is logarithmic, O(log_{2}n). This is because every time our search range becomes half.

- So, T(n)=T(n/2)+1(time for finding pivot)
- Using the master theorem you can find T(n) to be Log
_{2}n. Also, you can think this as a series of n/2+n/4+n/8+n/16+….+1 which is Log_{2}(n)

**Better way to find the pivot index:**

We were finding the pivot index like ** (left+ right)/2**. But one thing to notice that (left+ right) has a chance to lead to integer overflow. Hence a better method is to find the pivot index like below:

Pivot index= left+ (right-left)/2

**C++ implementation:**

#include <bits/stdc++.h> using namespace std; //iterative binary search int binary_search_iterative(vector<string> arr, string key) { int left = 0, right = arr.size(); while (left <= right) { int mid = left + (right - left) / 2; if (arr[mid] == key) return mid; else if (arr[mid] < key) left = mid + 1; else right = mid - 1; } return -1; } //recursive binary search int binary_search_recursive(vector<string> arr, string key, int left, int right) { if (left > right) return -1; int mid = left + (right - left) / 2; if (arr[mid] == key) return mid; else if (arr[mid] < key) return binary_search_recursive(arr, key, mid + 1, right); return binary_search_recursive(arr, key, left, mid - 1); } //to print void print(vector<string>& a) { for (auto it : a) cout << it << " "; cout << endl; } int main() { cout << "Enter number of words you want to enter for the word list\n"; int n; cin >> n; vector<string> arr(n); cout << "Enter the words\n"; for (int i = 0; i < n; i++) { cin >> arr[i]; } cout << "Enter searching key\n"; //key there string key; cin >> key; cout << "Sorting the input list to ensure binary search works\n"; sort(arr.begin(), arr.end()); cout << "Printing the sorted word list\n"; print(arr); clock_t tStart1 = clock(); int index = binary_search_iterative(arr, key); if (index == -1) cout << key << " not found\n"; else cout << key << " found at index(0 based): " << index << endl; clock_t tend1 = clock(); printf("Time taken in iterative binary search: %.6fs\n", (double)(tend1 - tStart1) / CLOCKS_PER_SEC); clock_t tStart2 = clock(); index = binary_search_recursive(arr, key, 0, n - 1); if (index == -1) cout << key << " not found\n"; else cout << key << " found at index(0 based): " << index << endl; clock_t tend2 = clock(); printf("Time taken in recursive binary search: %.6fs\n", (double)(tend2 - tStart2) / CLOCKS_PER_SEC); return 0; }

**Output:**

Enter number of words you want to enter for the word list 6 Enter the words includehelp india bad coding coder blog Enter searching key coding Sorting the input list to ensure binary search works Printing the sorted word list bad blog coder coding includehelp india coding found at index(0 based): 3 Time taken in iterative binary search: 0.000023s coding found at index(0 based): 3 Time taken in recursive binary search: 0.000007s

TOP Interview Coding Problems/Challenges

- Run-length encoding (find/print frequency of letters in a string)
- Sort an array of 0's, 1's and 2's in linear time complexity
- Checking Anagrams (check whether two string is anagrams or not)
- Relative sorting algorithm
- Finding subarray with given sum
- Find the level in a binary tree with given sum K
- Check whether a Binary Tree is BST (Binary Search Tree) or not
- 1[0]1 Pattern Count
- Capitalize first and last letter of each word in a line
- Print vertical sum of a binary tree
- Print Boundary Sum of a Binary Tree
- Reverse a single linked list
- Greedy Strategy to solve major algorithm problems
- Job sequencing problem
- Root to leaf Path Sum
- Exit Point in a Matrix
- Find length of loop in a linked list
- Toppers of Class
- Print All Nodes that don't have Sibling
- Transform to Sum Tree
- Shortest Source to Destination Path

Comments and Discussions

**Ad:**
Are you a blogger? Join our Blogging forum.