Searching - Linear/ Sequential, Binary and Interpolation Tutorial.

Data Structure Searching

Searching is the process of finding element in a given list. In this process we check item is available in given list or not.

Type of searching
1. Internal search
2. External search

Internal Search: In this search, Searching is performed on main or primary memory.

External Search: In this search, Searching is performed in secondary memory.



Complexity Analysis

The Complexity analysis is used to determine, the algorithm will take amount of resources (such as time and space) necessary to execute it.

There are two types of complexities: 1) Time Complexity and 2) Space Complexity.

Searching Techniques

There are basically three types of searching techniques:

  • Linear or sequential search
  • Binary search
  • Interpolation search

Linear/ Sequential Search

Linear/Sequential searching is a searching technique to find an item from a list until the particular item not found or list not reached at end. We start the searching from 0th index to N-1 index in a sequential manner, if particular item found, returns the position of that item otherwise return failure status or -1.

Linear Searching - Data Structure Tutorial
    LINEAR_SEARCH (LIST, N, ITEM, LOC, POS)
    In this algorithm, LIST is a linear array with N elements, 
    and ITEM is a given item to be searched. This algorithm finds 
    the location LOC of ITEM into POS in LIST, or sets POS := 0, 
    if search is unsuccessfull.
    1.	Set POS  :=  0;
    2.	Set LOC  :=  1;
    3.	Repeat STEPS a and b while LOC <= N
        a.	If ( LIST[LOC] = ITEM) then 
            i.	SET POS := LOC;  [get position of item]
            ii.	return;
        b.	Otherwise
            i.	SET LOC := LOC + 1 ; [increment counter]
    4.	[END OF LOOP]
    5.	return 

Complexity of Linear search

The complexity of search algorithm is based on number of comparisons C, between ITEM and LIST [LOC]. We seek C (n) for the worst and average case, where n is the size of the list.

Worst Case: The worst case occurs when ITEM is present at the last location of the list, or it is not there at al. In either situation, we have
C (n) = n
Now, C (n) = n is the worst case complexity of linear or sequential search algorithm.

Average case: In this case, we assume that ITEM is there in the list, and it can be present at any position in the list. Accordingly, the number of comparisons can be any of the numbers 1, 2, 3, 4 …n. and each number occurs with probability p = 1/n. Then,
C (n) = n/2

Implementation of Linear/ Sequential Search

            
#include <stdio.h>

/*Function to search item from list*/
int LinearSearch(int ele[], int item)
{
	int POS = -1;
	int LOC =  0;

	for( LOC = 0; LOC < SIZE; LOC++ )
	{
		if( ele[LOC] == item )
		{
			POS = LOC;
			break;
		}
	} 	

	return POS;
}

int main()
{
	int ele[SIZE];
	int i = 0;
	int item;
	int pos;

	printf("\nEnter Items : \n");

	for( i = 0; i < SIZE; i++)
	{
		printf("Enter ELE[%d] : ",i+1);
		scanf("%d",&ele[i]);
	}

	printf("\n\nEnter Item To Be Searched : ");
	scanf("%d",&item);

	pos = LinearSearch(ele,item);

	if( pos >= 0 )
	{
		printf("\nItem Found At Position : %d\n",pos+1);
	} 
	else
	{
		printf("\nItem Not Found In The List\n");
	}
	
	return 0;
}

    First run:

    Enter Items : 
    Enter ELE[1] : 20
    Enter ELE[2] : 10
    Enter ELE[3] : 30
    Enter ELE[4] : 40
    Enter ELE[5] : 50


    Enter Item To Be Searched : 30

    Item Found At Position : 3

    Second run:

    Enter Items : 
    Enter ELE[1] : 20
    Enter ELE[2] : 10
    Enter ELE[3] : 30
    Enter ELE[4] : 40
    Enter ELE[5] : 50


    Enter Item To Be Searched : 24

    Item Not Found In The List


#include <iostream>

using namespace std;
#define SIZE 5

class Search
{
	private:
		int ele[SIZE];

	public:
		void Input();
		int  LinearSearch( int item );
};

//Input Element into the list
void Search:: Input()
{
	int i = 0;

	cout<<"\nEnter Items : \n";

	for( i = 0; i < SIZE; i++)
	{
		cout<<"Enter ELE["<< i+1<<"] : ";
		cin>>ele[i];
	}
}

//Function to search item from list
int Search:: LinearSearch(int item)
{
	int POS = -1;
	int LOC =  0;

	for( LOC = 0; LOC < SIZE; LOC++ )
	{
		if( ele[LOC] == item )
		{
			POS = LOC;
			break;
		}
	} 	

	return POS;
}

int main()
{
	int i = 0;
	int item;
	int pos;
	
	Search s = Search();

	s.Input();
	
	cout<<"\n\nEnter Item To Be Searched : ";
	cin>>item;

	pos = s.LinearSearch(item);

	if( pos >= 0 )
	{
		cout<<"\nItem Found At Position : "<< pos+1<< endl;
	} 
	else
	{
		cout<<"\nItem Not Found In The List\n";
	}
	
	return 0;
}

    First run:

    Enter Items : 
    Enter ELE[1] : 20
    Enter ELE[2] : 10
    Enter ELE[3] : 30
    Enter ELE[4] : 40
    Enter ELE[5] : 50


    Enter Item To Be Searched : 30

    Item Found At Position : 3

    Second run:

    Enter Items : 
    Enter ELE[1] : 20
    Enter ELE[2] : 10
    Enter ELE[3] : 30
    Enter ELE[4] : 40
    Enter ELE[5] : 50


    Enter Item To Be Searched : 24

    Item Not Found In The List

Binary Search

It is special type of search work on sorted list only. During each stage of our procedure, our search for ITEM is reduced to a restricted segment of elements in LIST array. The segment starts from index LOW and spans to HIGH.

LIST [LOW], LIST [LOW+1], LIST [LOW+2], LIST [LOW+ 3]….. LIST[HIGH]

The ITEM to be searched is compared with the middle element LIST[MID] of segment, where MID obtained as:
MID = ( LOW + HIGH )/2

We assume that the LIST is sorted in ascending order. There may be three results of this comparison:

  1. If ITEM = LIST[MID], the search is successful. The location of the ITEM is LOC := MID.
  2. If ITEM < LIST[MID], the ITEM can appear only in the first half of the list. So, the segment is restricted by modifying HIGH = MID – 1.
  3. If ITEM > LIST[MID], the ITEM can appear only in the second half of the list. So, the segment is restricted by modifying LOW = MID + 1.
Initially, LOW is the start index of array, and HIGH is the last index of array.
The comparison goes on. With each comparison, low and high approaches near to each other. The loop will continue till LOW < HIGH.

Binary Searching - Data Structure Tutorial

    Let the item to be searched is: 15
    Step1: Initially
        1.	List[low]  = list[0] = 12
        2.	List[high] = list[9] = 99
        3.	Mid= (low + high)/2 = (0+9)/2 = 4
        4.	List[mid]=  list[4] = 35
    Step2: Since item (15) < list [mid] (35); high = mid -1 = 3
        1.	List[low]  = list[0] = 12
        2.	List[high] = list[3] = 28
        3.	Mid =(low + high)/2 = (0+3)/2 = 1
        4.	List [mid] = list [1] = 15.
    Step3: Since item (15) = list [mid] (15); 
    It is success, the location 1 is returned. 
    BINARY SEARCH (LIST, N, ITEM, LOC, LOW, MID, HIGH)
    Here LIST is a sorted array with size N, ITEM is the given information. 
    The variable LOW, HIGH and MID denote, respectively,
    the beginning, end and the middle of segment of LIST. 
    The algorithm is used to find the location LOC of ITEM in LIST array. 
    If ITEM not there LOC is NULL.

    1.	Set LOW := 1; LOW := N;
    2.	Repeat step a  to d while LOW <= HIGH and ITEM != LIST(MID)
        a.	Set MID := (LOW + HIGH)/2
        b.	If ITEM = LIST(MID), then
            i.	Set LOC := MID
            ii.	Return
        c.	If ITEM < LIST(MID) then
            i.	Set HIGH := MID - 1;
        d.	Otherwise
            i.	Set LOW := MID + 1;
    3.	SET  LOC := NULL
    4.	return

Complexity of Binary search

The complexity measured by the number f(n) of comparisons to locate ITEM in LIST where LIST contains n elements. Each comparison reduces the segment size in half. Hence, we require at most f(n) comparisons to locate ITEM, where:
2c >= n
Approximately, the time complexity is equal to log2n. It is much less than the time complexity of linear search.

Implementation of Binary Search


#include <stdio.h>

#define SIZE 5
/*To search item from sorted list*/
int BinarySearch( int ele[], int item )
{
	int POS  = -1 	;
	int LOW  =  0 	;
	int HIGH =  SIZE;
	int MID  =  0 	;

	while( LOW <= HIGH )
	{
		MID = (LOW + HIGH)/2;

		if( ele[MID] == item )
		{
			POS = MID ;
			break;
		}
		else if( item > ele[MID] )
		{
			LOW = MID + 1;
		}
		else
		{
			HIGH = MID - 1;
		}
	}	

	return POS;
}


int main()
{
	int ele[SIZE];
	int i = 0;
	int item;
	int pos;

	printf("\nEnter Items : \n");

	for( i = 0; i < SIZE; i++)
	{
		printf("Enter ELE[%d] : ",i+1);
		scanf("%d",&ele[i]);
	}

	printf("\n\nEnter Item To Be Searched : ");
	scanf("%d",&item);

	pos = BinarySearch(ele,item);

	if( pos >= 0 )
	{
		printf("\nItem Found At Position : %d\n",pos+1);
	} 
	else
	{
		printf("\nItem Not Found In The List\n");
	}

	return 0;
}
    First run:

    Enter Items : 
    Enter ELE[1] : 10
    Enter ELE[2] : 20
    Enter ELE[3] : 30
    Enter ELE[4] : 40
    Enter ELE[5] : 50


    Enter Item To Be Searched : 30

    Item Found At Position : 3

    Second run:

    Enter Items : 
    Enter ELE[1] : 10
    Enter ELE[2] : 20
    Enter ELE[3] : 30
    Enter ELE[4] : 40
    Enter ELE[5] : 50


    Enter Item To Be Searched : 24

    Item Not Found In The List


#include <iostream>

using namespace std;
#define SIZE 5

class Search
{
	private:
		int ele[SIZE];

	public:
		void Input();
		int  BinarySearch( int item );
};

//Input Element into the list
void Search:: Input()
{
	int i = 0;

	cout<<"\nEnter Items : \n";

	for( i = 0; i < SIZE; i++)
	{
		cout<<"Enter ELE["<< i+1<<"] : ";
		cin>>ele[i];
	}
}

//To search item from sorted list
int Search:: BinarySearch( int item )
{
	int POS  = -1 	;
	int LOW  =  0 	;
	int HIGH =  SIZE;
	int MID  =  0 	;

	while( LOW <= HIGH )
	{
		MID = (LOW + HIGH)/2;

		if( ele[MID] == item )
		{
			POS = MID ;
			break;
		}
		else if( item > ele[MID] )
		{
			LOW = MID + 1;
		}
		else
		{
			HIGH = MID - 1;
		}
	}	

	return POS;
}


int main()
{
	int i = 0;
	int item;
	int pos;
	
	Search s = Search();

	s.Input();
	
	cout<<"\n\nEnter Item To Be Searched : ";
	cin>>item;

	pos = s.BinarySearch(item);

	if( pos >= 0 )
	{
		cout<<"\nItem Found At Position : "<< pos+1<< endl;
	} 
	else
	{
		cout<<"\nItem Not Found In The List\n";
	}
	
	return 0;
}

    First run:

    Enter Items : 
    Enter ELE[1] : 10
    Enter ELE[2] : 20
    Enter ELE[3] : 30
    Enter ELE[4] : 40
    Enter ELE[5] : 50


    Enter Item To Be Searched : 30

    Item Found At Position : 3

    Second run:

    Enter Items : 
    Enter ELE[1] : 10
    Enter ELE[2] : 20
    Enter ELE[3] : 30
    Enter ELE[4] : 40
    Enter ELE[5] : 50


    Enter Item To Be Searched : 24

    Item Not Found In The List

Difference between Linear Search and Binary Search

Linear Search Binary Search
Sorted list is not required. Sorted list is required.
It can be used in linked list implementation. It cannot be used in liked list implementation.
It is suitable for a list changing very frequently. It is only suitable for static lists, because any change requires resorting of the list.
The average number of comparison is very high. The average number of comparison is relatively slow.

Interpolation Search

This technique is used if the items to be searched are uniformly distributed between the first and the last location. This technique is a simple modification in the binary search when MID is calculated.

Mid = low + (high – low) * ((item – LIST[low]) / (LIST[high] – LIST[low]));

Advantages

  1. If the items are uniformly distributed, the average case time complexity is log2(log2(n)).
  2. It is considered improvement in binary search.
Disadvantages
  1. The calculation of mid is complicated. It increases the execution time.
  2. If the items are not uniformly distributed, interpolation search will have very poor behavior.