Wildcard Pattern Matching – Given a text of length n and a wildcard pattern of length m, we are supposed to find whether the wildcard pattern matches the actual string.
The pattern will consist of 2 wildcard character ‘?’ and ‘*’ where –
- ‘?’ can match any single character
- ‘*’ can match any number of characters(including 0 characters).
Example –
text - nlognisacsportal pattern - ***nlognis??sportal ;output => match pattern - **nlogni?*ortal ;output => match pattern - lognis??sportal ;output => not-match pattern - nlognis?spo**l ;output => mot-match pattern - ****nlognis??spo**** ;output => match
This problem of Wildcard Pattern Matching can be solved using dynamic programming because it will consist of lots of repeating subproblems. But while creating the subproblems we have to take care of three vital cases –
Case 1: If pattern[i] = ‘?’
We know that character ‘?’ can match any character, hence when we encounter the ‘?’ character, we will always consider it as a match and move to the next character of pattern and text.
Case 2: If pattern[i] == text[j]
Since this is a match, it means our current pattern character and text character are the same, hence we will move to the next character of both the pattern and the text.
If pattern[i] != text[j], it means the pattern character doesn’t match the text character hence the wildcard pattern will never match the given text.
Case 3: If pattern[i] = ‘*’
This case is a little tricky because when this case occur we have the following two choices –
- We can ignore the ‘*’ character and move to the next character.
Ex -> text = nlogn & pattern = *nlo*gn - The character ‘*’ can match one or more characters of the Text. Hence we will move forward skipping one or more characters of a text.
- We can ignore the ‘*’ character and move to the next character.
Algorithm
n = text.length() m = pattern.length() Initialize, entire, dp[n+1][m+1] with all values false dp[0][0] = true //text is null and the pattern consists of '*' //because * will match with null character. //dp[i][0] will only be true for, consequent '*' //characters that appear the beginning of a pattern. for(i=1;i<m;i++) if(pattern[i] == '*') dp[i][0] = d[i-1][0] for(i=1; i<=m; i++): for(j=1; j<=n; j++): if(pattern[j-1] == str[i-1] || str[i-1] == '?') dp[i][j] = dp[i-1][j-1]; else if(str[i-1] == '*') dp[i][j] = dp[i-1][j] || dp[i][j-1]; else dp[i][j] = false;
Implementation
/*CPP program to find whether the wild-card pattern matches the given text*/ #include <bits/stdc++.h> using namespace std; int wildCard(string text, string pattern) { //find the length of pattern and text int m = pattern.length(); int n = text.length(); //create a 2D array for storing //subproblems bool dp[m+1][n+1]; //initialize the entire 2D array as false memset(dp, false, sizeof(dp)); //empty pattern will match with empty text dp[0][0] = true; //Consequative sequence of * at starting pattern //will match the null string and it will be true. for (int j = 1; j <= m; j++) if (pattern[j-1] == '*') dp[j][0] = dp[j-1][0]; //filling up the dp table in botton-up fashion. for(int i=1; i<=m; i++) { for(int j=1; j<=n; j++) { /*If current pattern character matches the text character or the '?' character, then it is a match and if the pattern sting till now matches the text, current table will be filled true*/ if(text[j-1] == pattern[i-1] || pattern[i-1] == '?') dp[i][j] = dp[i-1][j-1]; /* If we see a '*' then, 1)We ignore ‘*’ character and move to next character in the pattern, dp[i-1][j] 2)'*' character matches with ith char in input dp[i][j-1]*/ else if(pattern[i-1] == '*') dp[i][j] = dp[i-1][j] || dp[i][j-1]; // If characters don't match else dp[i][j] = false; } } return dp[m][n]; } /* Main program*/ int main() { string pattern, text; cin>>text; cin>>pattern; int res = wildCard(text, pattern); if(res == 1) cout<<"match"<<endl; else cout<<"not-match"<<endl; return 0; }
Output
Input text = nlognisacsportal pattern = ***nlognis??sportal Output => match
Time Complexity
The rum time complexity of implementing wildcard pattern matching using dynamic programming is O(N x M), where N is the length of the text and M is the length of the pattern. The Auxiliary space is also O(N x M).