Wildcard Matching

2014.2.28 01:49

Implement wildcard pattern matching with support for '?' and '*'.

'?' Matches any single character.
'*' Matches any sequence of characters (including the empty sequence).

The matching should cover the entire input string (not partial).

The function prototype should be:
bool isMatch(const char *s, const char *p)

Some examples:
isMatch("aa","a") → false
isMatch("aa","aa") → true
isMatch("aaa","aa") → false
isMatch("aa", "*") → true
isMatch("aa", "a*") → true
isMatch("ab", "?*") → true
isMatch("aab", "c*a*b") → false

Solution:

  At first I tried to implement an O(n * m) solution with dynamic programming and a 2D array, but it turned out to be neither efficient nor easy to write, so I gave up on it.
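  For comparison, here is a minimal sketch of what such a dynamic-programming solution could look like. It is not the code I submitted: the name isMatchDP is made up for illustration, and it keeps only two rolling rows instead of a full 2D array. Row entry j says whether the first j letters of s match the pattern prefix processed so far.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper (not the accepted solution): O(n * m) time, O(n) space.
bool isMatchDP(const char *s, const char *p) {
    if (s == nullptr || p == nullptr) {
        return false;
    }
    const std::size_t ls = std::strlen(s);
    const std::size_t lp = std::strlen(p);

    // prev[j]: do the first j letters of s match the first i - 1 letters of p?
    // cur[j]:  do the first j letters of s match the first i letters of p?
    std::vector<char> prev(ls + 1, 0), cur(ls + 1, 0);
    prev[0] = 1;  // the empty pattern matches the empty string

    for (std::size_t i = 1; i <= lp; ++i) {
        const char c = p[i - 1];
        // An empty s is matched only by a pattern prefix made of '*' alone.
        cur[0] = (c == '*') && prev[0];
        for (std::size_t j = 1; j <= ls; ++j) {
            if (c == '*') {
                // '*' matches nothing (prev[j]) or absorbs s[j - 1] (cur[j - 1]).
                cur[j] = prev[j] || cur[j - 1];
            } else {
                // A letter or '?' consumes exactly one letter of s.
                cur[j] = prev[j - 1] && (c == '?' || c == s[j - 1]);
            }
        }
        prev.swap(cur);
    }
    return prev[ls] != 0;
}
```

  The table costs O(n * m) time on every input, which is why the greedy approach described next is attractive when the pattern contains few stars.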

  '*' is the key to wildcard matching: when you meet a '*' you may skip an arbitrary number of letters, whereas for any other letter or '?' you always move exactly one step forward.

  Let's think about what to do when a mismatch happens, e.g. s[]="abcde", p[]="abcf". In general you can't just return false right away, because a '*' seen earlier in p opens up more possibilities.

  See this: s[]="abcxxadc", p[]="abc*abc". The leading "abc" matches and the '*' can absorb "xx", but the tail "adc" of s cannot be matched against the tail "abc" of p.

  If a letter in p is mismatched, go back to the last '*' seen, let it absorb one more letter of s, and restart the search from the letter right after that '*'. Consecutive '*'s are treated as one.
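  The remark about consecutive '*'s can be made concrete with a small preprocessing sketch. The helper name collapseStars is made up for illustration. The accepted code does not actually need this step, because each new '*' simply overwrites the recorded position of the previous one, but collapsing the stars first gives the same matching result.

```cpp
#include <string>

// Hypothetical helper: returns a copy of p in which every run of
// consecutive '*' is collapsed into a single '*'.
std::string collapseStars(const std::string &p) {
    std::string out;
    out.reserve(p.size());
    for (char c : p) {
        // Skip a '*' that directly follows another '*'.
        if (c == '*' && !out.empty() && out.back() == '*') {
            continue;
        }
        out.push_back(c);
    }
    return out;
}
```

  For example, "a**b***" and "a*b*" describe the same set of strings, so the matcher may work on either.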

  Why is this backtracking correct? Because a single '*' can cover any run of mismatched letters, only the most recent '*' ever needs to be revisited: an earlier '*' could not do anything that the later one cannot. The non-star letters still have to be matched exactly.

  The algorithm requires the pattern to be matched completely, so a partial match counts as a mismatch.

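  Space complexity is O(1). Time complexity is O(len(s) * len(p)) in the worst case, because every backtrack may rescan the part of the pattern after the last '*' (for example s[]="aaa...a", p[]="*aa...ab"). When backtracking is rare, the running time is close to O(len(s) + len(p)).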
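  To get a feeling for how much work the backtracking can cause, here is a sketch that only counts loop iterations instead of returning a match result. The function countGreedySteps is a hypothetical instrumented copy of the greedy loop, written with std::string for brevity. It is not part of the submitted solution.

```cpp
#include <cstddef>
#include <string>

// Hypothetical instrumentation: returns how many iterations the greedy
// loop performs before it either consumes all of s or gives up.
long countGreedySteps(const std::string &s, const std::string &p) {
    const std::size_t npos = std::string::npos;
    std::size_t i = 0, j = 0;   // current positions in p and s
    std::size_t star_p = npos;  // position of the most recent '*' in p
    std::size_t star_s = 0;     // where the text after that '*' is being tried in s
    long steps = 0;

    while (j < s.size()) {
        ++steps;
        if (i < p.size() && (p[i] == '?' || p[i] == s[j])) {
            ++i;
            ++j;
        } else if (i < p.size() && p[i] == '*') {
            star_p = i++;
            star_s = j;
        } else if (star_p != npos) {
            // Backtrack: let the last '*' absorb one more letter of s.
            i = star_p + 1;
            j = ++star_s;
        } else {
            break;  // mismatch with no '*' to fall back on
        }
    }
    return steps;
}
```

  With s made of n letters 'a' and p = "*" followed by m letters 'a' and one 'b', almost every start position rescans m pattern letters before failing on the 'b', so the count grows roughly like (n - m) * m rather than n + m.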

Accepted code:

// 2CE, 4WA, 1AC. Greedy solution that backtracks to the last '*'.
// O(1) extra space, O(m * n) time in the worst case. Not so easy to understand.
#include <cstring>
using namespace std;

class Solution {
public:
    bool isMatch(const char *s, const char *p) {
        if (s == nullptr || p == nullptr) {
            return false;
        }

        int ls = (int)strlen(s);
        int lp = (int)strlen(p);

        if (lp == 0) {
            // an empty pattern matches only an empty string
            return ls == 0;
        }

        // from here on, lp is guaranteed to be non-zero (ls may still be zero).
        int i = 0;             // current index in p
        int j = 0;             // current index in s
        int last_star_p = -1;  // index of the most recent '*' in p, -1 if none
        int last_star_s = 0;   // index in s where the text after that '*' is being tried

        while (j < ls) {
            // p[i] may be the terminating '\0' here; it then never equals s[j]
            // and control falls through to the backtracking branch.
            if (p[i] == '?' || p[i] == s[j]) {
                ++i;
                ++j;
            } else if (p[i] == '*') {
                // remember this '*' and first try to let it match nothing
                last_star_p = i;
                ++i;
                last_star_s = j;
            } else if (last_star_p != -1) {
                // backtrack to the last '*', and move to the next letter in s
                i = last_star_p + 1;
                j = last_star_s + 1;
                ++last_star_s;
            } else {
                return false;
            }
        }
        while (p[i] == '*') {
            // skip the trailing stars
            ++i;
        }

        // the whole pattern must have been consumed
        return i == lp;
    }
};

 

 posted on 2014-02-28 02:06  zhuli19901106