KMP

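  • KMP decides in linear time, \(\large O(n + m)\) for \(\large |A| = n\) and \(\large |B| = m\), whether string \(\large A\) is a substring of string \(\large B\), and finds the position of every occurrence of \(\large A\) in \(\large B\).

  • The KMP algorithm has two steps: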

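    ① Match string \(\large A\) against itself to compute an array \(\large next\), where \(\large next[i]\) is the greatest length at which "a non-prefix substring of \(A\) ending at position \(i\)" matches "a prefix of \(A\)". That is, \(\large next[i] = \max\{j\}\), where \(\large j < i\) and \(\large A[i - j + 1 \sim i] = A[1 \sim j]\).

    In particular, when no such \(\large j\) exists, set \(\large next[i] = 0\).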

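    ② Match string \(\large A\) against string \(\large B\) to compute an array \(\large f\), where \(\large f[i]\) is the greatest length at which "a substring of \(\large B\) ending at position \(\large i\)" matches "a prefix of \(\large A\)". That is, \(\large f[i] = \max\{j\}\), where \(\large j \le i\) and \(\large B[i - j + 1 \sim i] = A[1 \sim j]\).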
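    The two definitions can be checked directly, without any of the KMP machinery. The sketch below is a brute-force evaluation of them, useful only for testing small inputs. The function names `next_by_definition` and `f_by_definition` are hypothetical, and the returned vectors are 1-indexed (index 0 is unused) to match the notation above.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// next[i] = largest j < i with A[i-j+1..i] == A[1..j], or 0 if none (1-indexed).
std::vector<int> next_by_definition(const std::string& a) {
    int n = static_cast<int>(a.size());
    std::vector<int> nxt(n + 1, 0);  // nxt[0] is unused
    for (int i = 1; i <= n; i++)
        for (int j = i - 1; j >= 1; j--)
            // 1-indexed A[i-j+1..i] starts at 0-indexed offset i-j
            if (a.compare(i - j, j, a, 0, j) == 0) { nxt[i] = j; break; }
    return nxt;
}

// f[i] = largest j <= i with B[i-j+1..i] == A[1..j], or 0 if none (1-indexed).
std::vector<int> f_by_definition(const std::string& a, const std::string& b) {
    int n = static_cast<int>(a.size());
    int m = static_cast<int>(b.size());
    std::vector<int> f(m + 1, 0);  // f[0] is unused
    for (int i = 1; i <= m; i++)
        for (int j = std::min(i, n); j >= 1; j--)  // j cannot exceed |A|
            if (b.compare(i - j, j, a, 0, j) == 0) { f[i] = j; break; }
    return f;
}
```

    For \(A\) = "ababaca" this gives \(next[1 \sim 7] = 0, 0, 1, 2, 3, 0, 1\). The brute force is far slower than linear, which is what the two passes of KMP avoid.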

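    Computing the \(\large next\) array:

    Ⅰ Initialize \(\large next[1] = j = 0\). Assume \(\large next[1 \sim i - 1]\) has already been computed; we now compute \(\large next[i]\).

    Ⅱ Repeatedly try to extend the match length \(\large j\). If the extension fails (the next character differs, \(\large A[i] \ne A[j + 1]\)), set \(\large j = next[j]\), which is the next shorter prefix that still matches a substring ending at \(\large i - 1\). Continue until the extension succeeds or \(\large j = 0\) (matching must restart from the beginning).

    Ⅲ If the extension succeeds, increment the match length: \(\large j++\). The value of \(\large next[i]\) is then \(\large j\).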

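next[1] = 0;
for (int i = 2, j = 0; i <= n; i++)
{
	// extension failed: fall back to the next shorter candidate length
	while (j > 0 && a[i] != a[j + 1])
		j = next[j];
	// extension succeeded: the match grows by one character
	if (a[i] == a[j + 1])
		j++;
	next[i] = j;
}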
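The snippet above relies on a 1-indexed character array `a`, its length `n`, and an array `next` declared elsewhere. As a minimal self-contained sketch of the same loop, the hypothetical function `build_next` below takes a `std::string`, pads it with one leading character so that indices run from 1, and returns the array as a vector (index 0 unused). The name `nxt` is used here only so that the identifier cannot clash with `std::next` in programs that write `using namespace std`.

```cpp
#include <string>
#include <vector>

// Returns the 1-indexed next array of `pattern`; element 0 is unused.
std::vector<int> build_next(const std::string& pattern) {
    int n = static_cast<int>(pattern.size());
    std::string a = " " + pattern;   // pad so that a[1..n] holds the pattern
    std::vector<int> nxt(n + 1, 0);  // nxt[1] = 0 by initialization
    for (int i = 2, j = 0; i <= n; i++) {
        // fall back until a[i] can extend the current match or j reaches 0
        while (j > 0 && a[i] != a[j + 1])
            j = nxt[j];
        if (a[i] == a[j + 1])
            j++;
        nxt[i] = j;
    }
    return nxt;
}
```

Although the inner `while` may run several times for one `i`, `j` increases by at most one per outer iteration and every fallback strictly decreases it, so the total work is \(O(n)\).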

The \(\large f\) array is computed in the same way.

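for (int i = 1, j = 0; i <= m; i++)
{
	// also fall back after a full match (j == n): there is no a[n + 1] to extend with
	while (j > 0 && (j == n || b[i] != a[j + 1]))
		j = next[j];
	if (b[i] == a[j + 1])
		j++;
	f[i] = j;
	// f[i] == n means an occurrence of A in B ends at position i (it starts at i - n + 1)
}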
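Putting the two passes together, the following is a sketch of a complete search that reports where each occurrence starts. The function name `kmp_find_all`, the padding trick, and the choice to return 1-indexed start positions are assumptions made for illustration. It records a position whenever the match length reaches `n` instead of storing the whole \(f\) array.

```cpp
#include <string>
#include <vector>

// Returns the 1-indexed start positions of every occurrence of `pattern` in `text`.
std::vector<int> kmp_find_all(const std::string& pattern, const std::string& text) {
    int n = static_cast<int>(pattern.size());
    int m = static_cast<int>(text.size());
    std::vector<int> positions;
    if (n == 0 || n > m) return positions;  // nothing to report

    std::string a = " " + pattern, b = " " + text;  // shift to 1-indexed
    std::vector<int> nxt(n + 1, 0);

    // Step 1: match the pattern against itself.
    for (int i = 2, j = 0; i <= n; i++) {
        while (j > 0 && a[i] != a[j + 1]) j = nxt[j];
        if (a[i] == a[j + 1]) j++;
        nxt[i] = j;
    }

    // Step 2: match the pattern against the text; j plays the role of f[i].
    for (int i = 1, j = 0; i <= m; i++) {
        while (j > 0 && (j == n || b[i] != a[j + 1])) j = nxt[j];
        if (b[i] == a[j + 1]) j++;
        if (j == n) positions.push_back(i - n + 1);  // occurrence ends at i
    }
    return positions;
}
```

For example, `kmp_find_all("aba", "ababa")` yields the overlapping occurrences at positions 1 and 3.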
posted @ 2021-11-10 22:48  Carlotta24