ADVERTISEMENT
ADVERTISEMENT

NAJPF - Pattern Find

NAJPF - Pattern Find: You are given a string and a pattern. You find the pattern on the given string. The problem has been featured in interview/coding rounds of many tech companies such as Amazon, Accolite, MakemyTrip, etc.
Submitted by Divyansh Jaipuriyar, on April 19, 2021

Problem Statement:

You are given a string and a pattern. You find the pattern on the given string. If found print how many times found the pattern and their index. Otherwise, print ‘Not Found’.

Problem Description:

The given problem, wants you to find those indices from the first string from where if we find the substring of length equal to the second string then both the substring and second string are equal. If it is not possible to find any of the indices then you are asked to print "Not Found".

Input: The input line consists of a number of T test cases. Each test case has two strings A and B. Here |A|>|B|.

Output: For each case print the number (found pattern from the given string) next line their position. Otherwise, print 'Not Found'. There will a blank line between the two cases.

Example:

Input:
T=1
A=ababab B=ab

Output:
3
1 3 5

Explanation: 
Since the pattern "ab" occurs at indices 1,3 and 5 so size is 3.

Solution Approach:

(1) Brute Force Approach:

In this approach, we will generate substring of length equal to the pattern and then match each character of the substring of with the pattern, if the substring matches with pattern then we add that index with our answer vector.

  • Time Complexity for the above approach is: O(n*m)
  • Space Complexity for the above approach is: O(n)

Program:

#include <bits/stdc++.h>
using namespace std;

typedef long long ll;

int main()
{
    ll t;
    cout << "Enter number of test cases: ";
    cin >> t;

    while (t--) {
        cout << "Enter string A and B: ";
        string A, B;
        cin >> A >> B;

        ll n = A.length(); //string A length in variable n.
        ll m = B.length(); //string B length in variable m.

        string str; //for substring of length m.

        vector<ll> ans; //vector to store indices for solution.

        for (ll i = 0; i <= n - m; i++) {
            str = A.substr(i, m); //take substring of length m.
            ll j = 0;
            //match both string.
            for (; j < m; j++) {
                if (str[j] != B[j])
                    break;
            }
            if (j == m)
                ans.push_back(i + 1); //push the index if matched.
        }

        //check if we found any answer or not.
        if (ans.size() > 0) {
            //print solution.
            cout << ans.size() << "\n";
            for (ll i = 0; i < ans.size(); i++)
                cout << ans[i] << " ";
            cout << "\n";
        }
        else //else if not found any solution.
            cout << "Not Found"
                 << "\n";
    }
    return 0;
}

Output:

Enter number of test cases: 2
Enter string A and B: abababababa ba 
5
2 4 6 8 10 
Enter string A and B: aaaaaaaaa bbbbb
Not Found

(2) Kmp Algorithm Approach:

In this approach, we will use Knuth Morris Pratt Algorithm. To use Kmp algorithm, we need to construct the longest prefix and suffix array i.e., lps array, to construct lps array we will use the following approach:

Here, we will use an array that represents the length of the longest proper prefix which is also equal to the suffix of the array.

Following are the steps that are involved in this approach:

  1. Here, we use lps[] array which stores the length of the longest proper prefix which is equal to the suffix, each index of lps represents the current index lps upto that index comparison. Initialize lps[0]=0, since it is not possible to have proper prefix-suffix with one letter then we declare two variable j and i by which we compare the characters.
    We can get lps[] array with the help of two-variable len and i but to avoid confusion i used j variable here.
  2. We declare a function longestprefixsuffix() which will take string str as its parameter, len variable inside it will store the length of the given string, then we create lps[len] array with lps[0]=0 as explained above, then we initialize index i to 1 and j to 0.
  3. We iterate i from 1 to len -1 and during each iteration, we compare str[i] with str[j] and check if both are equal or not. If equal then we increment j by one and assign lps[i] to j and increment i. Else if they are not equal then we search for that index whose character matches with the current character at index i.
  4. Now After the formation of lps array we need to perform the following operation:
    1. With the help of lps array we avoid naive pattern matching method, each time we use to increment one character at a time for comparison but here we will not match the character that we already know.
    2. We take the window of size of the pattern and start the comparison with the index starting from 0, i.e., j=0 with B[j] and A[i] where i begins from 0.
    3. We will match character of both strings {(A[i]==B[j])i++,j++} until we confront some mismatch.
    4. When we confront a mismatch, We know that characters A[0..j-1] match with B[i-j…i-1] (here j starts with 0 and increment it only when there is a match).
      We also know that lps[j-1] is the count of characters of B[0…j-1] that are both proper prefix and suffix, thus we can conclude that we do not need to match these lps[j-1] characters with A[i-j…i-1] because we know that these characters will anyway match.
  • Time Complexity for the above approach is: O(n+m)
  • Space Complexity for the above approach is :O(n)

Program:

#include <bits/stdc++.h>
using namespace std;

typedef long long ll;

//longestprefixsuffix function.
vector<ll> longestprefixsuffix(string B)
{
    ll m = B.length(); //length of string.
    vector<ll> lps(m); //declare vector.
    lps[0] = 0; //single character case so length lps is 0.
    ll i = 1; //initialize index for comparison from 1.
    ll len = 0; //initial longest prefix suffix length.
    
    while (i < m) {
        //check current index character with character stored at len.
        if (B[i] == B[len]) {
            len++; //increase length by one.
            lps[i] = len; //lps for current index is equal to j
            i++; //move forward.
        }
        else //if B[i]!=B[j]
        {

            //we search for that index which character matches
            //with the current character at index i.
            if (len != 0)
                len = lps[len - 1];
            else //if j==0 then lps[i]=0.
            {
                lps[i] = 0;
                i++;
            }
        }
    }
    return lps; //return lps vector.
}

//function that will return indices of matched pattern.
vector<ll> KMP(string A, string B)
{
    ll n = A.length(); //string A length.
    ll m = B.length(); //string B length.

    //call function longestprefixsuffix for lps array.
    vector<ll> lps = longestprefixsuffix(B);

    ll i = 0; // index for A
    ll j = 0; // index for B

    //temporary vector to store indices of matched patterns.
    vector<ll> v1;

    //start pattern matching.
    while (i < n) {
        //if both strings characters matches.
        if (A[i] == B[j]) {

            //move to next character.
            i++;
            j++;
        }
        if (j == m) //if we get some substring which is equal to pattern
        {
            v1.push_back(i - j + 1); //we use +1 for 1 based indexing.
            j = lps[j - 1];
        }
        // mismatch after j matches
        else if (i < n and A[i] != B[j]) {
            // Do not match lps[0..lps[j-1]] characters,
            // they will match anyway
            if (j != 0)
                j = lps[j - 1];
            else
                i++;
        }
    }
    return v1; //return indices vector.
}

int main()
{
    ll t;
    cout << "Enter number of test cases: ";
    cin >> t;
    
    while (t--) {
        cout << "Enter strings A and B: ";
        string A, B;
        cin >> A >> B;
    
        vector<ll> v1 = KMP(A, B); //call KMP function.

        //check if we found any answer or not.
        if (v1.size() != 0) {
            //print solution.
            cout << v1.size() << "\n";
            for (auto it = v1.begin(); it != v1.end(); it++) {
                cout << *it << " ";
            }
        }
        else //else if not found any solution.
            cout << "Not Found";
        cout << "\n";
    }
    
    return 0;
}

Output:

Enter the number of test cases: 3
Enter strings A and B: abababab ab
4
1 3 5 7 
Enter strings A and B: aaaaa bb
Not Found
Enter strings A and B: aaaaaabbbb abb
1
6 

Problem source: NAJPF - Pattern Find

ADVERTISEMENT



ADVERTISEMENT



Comments and Discussions


ADVERTISEMENT

ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT

Languages: » C » C++ » C++ STL » Java » Data Structure » C#.Net » Android » Kotlin » SQL
Web Technologies: » PHP » Python » JavaScript » CSS » Ajax » Node.js » Web programming/HTML
Solved programs: » C » C++ » DS » Java » C#
Aptitude que. & ans.: » C » C++ » Java » DBMS
Interview que. & ans.: » C » Embedded C » Java » SEO » HR
CS Subjects: » CS Basics » O.S. » Networks » DBMS » Embedded Systems » Cloud Computing
» Machine learning » CS Organizations » Linux » DOS
More: » Articles » Puzzles » News/Updates

© https://www.includehelp.com some rights reserved.