Home Practice Programming Wildcard Pattern Matching using Dynamic Programming

Wildcard Pattern Matching using Dynamic Programming

Wildcard Pattern Matching – Given a text of length n and a wildcard pattern of length m, we are supposed to find whether the wildcard pattern matches the actual string.

The pattern will consist of 2 wildcard character ‘?’ and ‘*’ where –

  1. ‘?’ can match any single character
  2. ‘*’ can match any number of characters(including 0 characters).

Example –

text - nlognisacsportal

pattern - ***nlognis??sportal   ;output => match
pattern - **nlogni?*ortal       ;output => match
pattern - lognis??sportal       ;output => not-match
pattern - nlognis?spo**l        ;output => mot-match
pattern - ****nlognis??spo****  ;output => match

 

This problem of Wildcard Pattern Matching can be solved using dynamic programming because it will consist of lots of repeating subproblems. But while creating the subproblems we have to take care of three vital cases –

Case 1: If pattern[i] = ‘?’
We know that character ‘?’ can match any character, hence when we encounter the ‘?’ character, we will always consider it as a match and move to the next character of pattern and text.

Case 2: If pattern[i] == text[j] 
Since this is a match, it means our current pattern character and text character are the same, hence we will move to the next character of both the pattern and the text.
If pattern[i] != text[j], it means the pattern character doesn’t match the text character hence the wildcard pattern will never match the given text.

Case 3: If pattern[i] = ‘*’
This case is a little tricky because when this case occur we have the following two choices –

    1. We can ignore the ‘*’ character and move to the next character.
      Ex -> text = nlogn & pattern = *nlo*gn
    2. The character ‘*’ can match one or more characters of the Text. Hence we will move forward skipping one or more characters of a text.

Algorithm

n = text.length()
m = pattern.length()

Initialize, 
entire, dp[n+1][m+1] with all values false
dp[0][0] = true

//text is null and the pattern consists of '*'
//because * will match with null character.
//dp[i][0] will only be true for, consequent '*'
//characters that appear the beginning of a pattern.
for(i=1;i<m;i++)
    if(pattern[i] == '*')
        dp[i][0] = d[i-1][0]


for(i=1; i<=m; i++):
        for(j=1; j<=n; j++):

                if(pattern[j-1] == str[i-1] || str[i-1] == '?')
                    dp[i][j] = dp[i-1][j-1];
                    
                else if(str[i-1] == '*')
                    dp[i][j] = dp[i-1][j] || dp[i][j-1];
             
                else 
                    dp[i][j] = false;

Implementation

/*CPP program to find whether the wild-card
 pattern matches the given text*/

#include <bits/stdc++.h>
using namespace std;

int wildCard(string text, string pattern)
{
    //find the length of pattern and text
    int m = pattern.length();
    int n = text.length();

    //create a 2D array for storing
    //subproblems 
    bool dp[m+1][n+1];

    //initialize the entire 2D array as false
    memset(dp, false, sizeof(dp));

    //empty pattern will match with empty text
    dp[0][0] = true;

    //Consequative sequence of * at starting pattern 
    //will match the null string and it will be true. 
    
    for (int j = 1; j <= m; j++) 
        if (pattern[j-1] == '*') 
            dp[j][0] = dp[j-1][0];

    //filling up the dp table in botton-up fashion.
    for(int i=1; i<=m; i++)
    {
        for(int j=1; j<=n; j++)
        {
                /*If current pattern character matches the text character
                  or the '?' character, then it is a match and if the pattern
                  sting till now matches the text, current table will be filled true*/
                
                if(text[j-1] == pattern[i-1] || pattern[i-1] == '?')
                    dp[i][j] = dp[i-1][j-1];
                
                /* If we see a '*' then, 
                  1)We ignore ‘*’ character and move 
                    to next  character in the pattern, 
                        dp[i-1][j]
                  2)'*' character matches with ith char in input
                        dp[i][j-1]*/ 
                
                else if(pattern[i-1] == '*')
                    dp[i][j] = dp[i-1][j] || dp[i][j-1];
                
                // If characters don't match 
                else 
                    dp[i][j] = false;
        }
    }
    return dp[m][n];
}   

/* Main program*/
int main()
{

    string pattern, text;

    cin>>text;
    cin>>pattern;

    int res = wildCard(text, pattern);

    if(res == 1)
        cout<<"match"<<endl;
    else
        cout<<"not-match"<<endl;
   
   return 0;
}


Output

Input 
  text = nlognisacsportal
  pattern = ***nlognis??sportal
Output => match

Time Complexity

The rum time complexity of implementing wildcard pattern matching using dynamic programming is O(N x M), where N is the length of the text and M is the length of the pattern. The Auxiliary space is also O(N x M).

LEAVE A REPLY

Please enter your comment!
Please enter your name here