Trie Optimization

Idea

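The standard trie trades space for time: it stores children in a two-dimensional array, much like an adjacency matrix in graph theory, which gives it a larger constant factor and makes it slower to initialize. On problems that need many re-initializations (for example, many test cases), initialization becomes the complexity bottleneck. The improved version instead uses a "linked-list structure" similar to the chained forward star (链式前向星) used for graphs.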
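For contrast, here is a minimal sketch of the array-based layout described above. It is not the code from this post: the struct name `ArrayTrie`, the constant `SIGMA`, and the restriction to lowercase letters are assumptions made to keep the example short. The point it illustrates is that every node carries a full row of `SIGMA` child slots that has to be zeroed, whether or not those slots are ever used.

```cpp
#include <array>
#include <string>
#include <vector>

// Baseline for comparison: each node owns one row of child slots,
// much like one row of an adjacency matrix.
// Assumption: input consists of lowercase 'a'..'z' only.
struct ArrayTrie {
    static constexpr int SIGMA = 26;             // assumed alphabet size
    std::vector<std::array<int, SIGMA>> ch;      // ch[p][c] = child of p via c, 0 if none
    std::vector<int> siz;                        // siz[p] = strings passing through p

    ArrayTrie() { clear(); }

    // Drop everything except the root (node 0).
    void clear() {
        ch.assign(1, std::array<int, SIGMA>{});
        siz.assign(1, 0);
    }

    void insert(const std::string& s) {
        int p = 0;
        for (char c : s) {
            int x = c - 'a';
            if (!ch[p][x]) {
                ch[p][x] = (int)ch.size();
                ch.emplace_back();               // zero-fills SIGMA ints for one new node
                siz.push_back(0);
            }
            p = ch[p][x];
            ++siz[p];
        }
    }

    // Number of inserted strings that have s as a prefix.
    int query(const std::string& s) const {
        int p = 0;
        for (char c : s) {
            p = ch[p][c - 'a'];
            if (!p) return 0;                    // the root is never a child, so 0 means "missing"
        }
        return siz[p];
    }
};
```

Child lookup is a single array access here, but memory and reset work both scale with the number of nodes times `SIGMA`, which is the cost the linked-list layout is meant to avoid.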

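My version differs from the chained forward star in one respect: the parent does not iterate over its children itself. Instead, the parent jumps directly to one of its children, and that child leads on to its siblings. It could probably be rewritten in the forward-star style; I have not tried it, but on paper it should work.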
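The forward-star-style variant mentioned above might look roughly like the sketch below. This is one reading of that idea, not code from the post: the names `ForwardStarTrie`, `head`, and `find` are hypothetical. The parent scans its own child list, new children are pushed at the front (so the list is unsorted), and no sentinel node is needed.

```cpp
#include <string>
#include <vector>

// Sketch of a forward-star-style child list.
// head[p] is the first child of p, nxt[c] is the next sibling of c.
// Index 0 means "no node"; index 1 is the root.
struct ForwardStarTrie {
    std::vector<int> head, nxt, siz;
    std::vector<char> v;                         // v[c] = character stored at node c

    ForwardStarTrie() { clear(); }

    void clear() {
        head.assign(2, 0);
        nxt.assign(2, 0);
        siz.assign(2, 0);
        v.assign(2, 0);
    }

    // Return the child of p labelled x, or 0 if there is none.
    int find(int p, char x) const {
        for (int c = head[p]; c; c = nxt[c])
            if (v[c] == x) return c;
        return 0;
    }

    void insert(const std::string& s) {
        int p = 1;
        for (char x : s) {
            int c = find(p, x);
            if (!c) {
                c = (int)v.size();               // index of the node about to be created
                v.push_back(x);
                nxt.push_back(head[p]);          // link in front of the old first child
                head.push_back(0);
                siz.push_back(0);
                head[p] = c;
            }
            p = c;
            ++siz[p];
        }
    }

    // Number of inserted strings that have s as a prefix.
    int query(const std::string& s) const {
        int p = 1;
        for (char x : s) {
            p = find(p, x);
            if (!p) return 0;
        }
        return siz[p];
    }
};
```

The trade-off is that an unsorted list has to be scanned to the end before a miss can be reported, whereas a sorted list can stop as soon as it passes the target character.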

Code

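\(v_i\) is the character stored at node \(i\), and \(top_i\) is one of the node's children (the head of its child list).

\(nxt_i\) is the node's next sibling, and \(siz_i\) is the number of inserted strings that pass through the node.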

#include<bits/stdc++.h>
#define rep(u,x,y) for(int u=x;u<=y;++u)
#define debug cout<<__LINE__<<' '<<__FUNCTION__<<'\n';
#define gc getchar()
#define pc putchar
using namespace std;
const int N=3e6+1,inf=1e9+6;
typedef long long ll;
ll read(){
	ll x=0,w=1;
	char ch=gc;
	while(ch<'0'||ch>'9')(ch=='-')&&(w=-1,0),ch=gc;
	while(ch>='0'&&ch<='9')x=(x<<3)+x+x+(ch^48),ch=gc;
	return x*w;
}
void wrt(ll x){
	if(x<0)pc('-'),x=-x;
	if(x>9)wrt(x/10);
	pc(x%10^48);
}
int t,n,q;
char s[N];
struct trie{
    int cnt,nxt[N],top[N];
    int siz[N],v[N];
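    // Insert s, keeping every child list sorted by character.
    void insert(char* s){
        int p=1,len=strlen(s),x,pos;
        for(int i=0;i<len;i++){
            x=s[i],pos=top[p];
            // x is smaller than the first child (or p has no child, since v[0]=inf):
            // the new node becomes the head of p's child list.
            if(x<v[pos]){
                v[++cnt]=x,top[p]=cnt;
                nxt[cnt]=pos,p=cnt;
            }
            else{
                // Advance to the last sibling whose character is <= x.
                while(v[nxt[pos]]<=x) pos=nxt[pos];
                if(v[pos]!=x){
                    // Not present: splice a new node in right after pos.
                    v[++cnt]=x;
                    nxt[cnt]=nxt[pos];
                    nxt[pos]=cnt,p=cnt;
                }
                else p=pos;
            }
            siz[p]++; // one more string passes through p
        }
    }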
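    // Return how many inserted strings have s as a prefix (0 if none).
    int query(char* s){
        int p=1,len=strlen(s),x,pos;
        for(int i=0;i<len;i++){
            x=s[i],pos=top[p];
            // Child lists are sorted, so stop at the last sibling with character <= x.
            while(v[nxt[pos]]<=x) pos=nxt[pos];
            if(v[pos]!=x) return 0; // also covers pos==0, since v[0]=inf
            else p=pos;
        }
        return siz[p];
    }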
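    // Reset only the nodes used so far: O(cnt), with no per-node alphabet row to wipe.
    // nxt[] and v[] need no reset because insert overwrites them when a node is allocated.
    void clear(){
        for(int i=1;i<=cnt;i++)
            siz[i]=top[i]=0;
        v[0]=inf,cnt=1; // node 0: sentinel larger than any character; node 1: root
    }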
}T;

int main(){
	for(t=read();t--;){
		n=read(),q=read(),T.clear();
		while(n--)scanf("%s",s),T.insert(s);
		while(q--)scanf("%s",s),printf("%d\n",T.query(s));
	}
	return 0;
}

Trie state table

As an example, we insert face, facility, faces, facer, name, in that order.

Node `p`	`v[p]` (character)	`top[p]` (head of child list)	`nxt[p]` (next sibling)	`siz[p]` (count)	Description
0	`INF`	-	-	-	Sentinel node
1	-	2	-	-	Root node
2	`'f'`	3	13	4	Path `f` (`face`, `facility`, `faces`, `facer`)
3	`'a'`	4	0	4	Path `fa`
4	`'c'`	5	0	4	Path `fac`, branches into `e` and `i`
5	`'e'`	12	6	3	Path `face` (end of `face`), branches into `r` and `s`
6	`'i'`	7	0	1	Path `faci` (`facility`)
7	`'l'`	8	0	1	Path `facil`
8	`'i'`	9	0	1	Path `facili`
9	`'t'`	10	0	1	Path `facilit`
10	`'y'`	0	0	1	End node (`facility`)
11	`'s'`	0	0	1	End node (`faces`)
12	`'r'`	0	11	1	End node (`facer`)
13	`'n'`	14	0	1	New path `n` (`name`)
14	`'a'`	15	0	1	Path `na`
15	`'m'`	16	0	1	Path `nam`
16	`'e'`	0	0	1	End node (`name`)
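A state table like this is easy to get wrong by hand, so one way to check it is to run the insert routine on the five words and print the arrays. The sketch below copies the post's insert logic into a small struct for that purpose. The names `DemoTrie`, `build_example`, and `dump`, and the capacity `M`, are assumptions made for this demo only.

```cpp
#include <cstdio>

// Small-capacity copy of the post's trie, used only to print its arrays.
struct DemoTrie {
    static const int M = 64;                 // assumed capacity, enough for the example
    static const int INF = 1000000006;       // sentinel value stored in v[0]
    int cnt, nxt[M], top[M], siz[M], v[M];

    void clear() {
        for (int i = 0; i < M; i++) nxt[i] = top[i] = siz[i] = v[i] = 0;
        v[0] = INF;                          // node 0 is the sentinel
        cnt = 1;                             // node 1 is the root
    }

    void insert(const char* s) {
        int p = 1;
        for (int i = 0; s[i]; i++) {
            int x = s[i], pos = top[p];
            if (x < v[pos]) {                // new head of p's child list
                v[++cnt] = x; top[p] = cnt;
                nxt[cnt] = pos; p = cnt;
            } else {
                while (v[nxt[pos]] <= x) pos = nxt[pos];
                if (v[pos] != x) {           // splice a new node in after pos
                    v[++cnt] = x;
                    nxt[cnt] = nxt[pos];
                    nxt[pos] = cnt; p = cnt;
                } else p = pos;
            }
            siz[p]++;
        }
    }

    // Insert the five example words in the order given in the text.
    void build_example() {
        clear();
        const char* words[] = {"face", "facility", "faces", "facer", "name"};
        for (const char* w : words) insert(w);
    }

    // Print one tab-separated row per non-root node: p, v[p], top[p], nxt[p], siz[p].
    void dump() const {
        for (int p = 2; p <= cnt; p++)
            std::printf("%d\t'%c'\t%d\t%d\t%d\n", p, v[p], top[p], nxt[p], siz[p]);
    }
};
```

Calling `build_example()` followed by `dump()` on a `DemoTrie` prints the rows for nodes 2 and up, which can then be compared with the table line by line.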

posted @ 2025-08-20 20:55 badn
