# hdu 4117 -- GRE Words (AC自动机+线段树)

problem

Recently George is preparing for the Graduate Record Examinations (GRE for short). Obviously the most important thing is reciting the words.
Now George is working on a word list containing N words.
He has so poor a memory that it is too hard for him to remember all of the words on the list. But he does find a way to help him to remember. He finds that if a sequence of words has a property that for all pairs of neighboring words, the previous one is a substring of the next one, then the sequence of words is easy to remember.
So he decides to eliminate some words from the word list first to make the list easier for him. Meantime, he doesn't want to miss the important words. He gives each word an importance, which is represented by an integer ranging from -1000 to 1000, then he wants to know which words to eliminate to maximize the sum of the importance of remaining words. Negative importance just means that George thought it useless and is a waste of time to recite the word.
Note that although he can eliminate any number of words from the word list, he can never change the order between words. In another word, the order of words appeared on the word list is consistent with the order in the input. In addition, a word may have different meanings, so it can appear on the list more than once, and it may have different importance in each occurrence.

Input

The first line contains an integer T(1 <= T <= 50), indicating the number of test cases.
Each test case contains several lines.
The first line contains an integer N(1 <= N <= 2 * 10 4), indicating the number of words.
Then N lines follows, each contains a string S i and an integer W i, representing the word and its importance. S i contains only lowercase letters.
You can assume that the total length of all words will not exceeded 3 * 10 5.

Output

For each test case in the input, print one line: "Case #X: Y", where X is the test case number (starting with 1) and Y is the largest importance of the remaining sequence of words.

Sample Input

1
5
a 1
ab 2
abb 3
baba 5
abbab 8

Sample Output

Case #1: 14思路：AC自动机+线段树。这题直观的想法是以每个字符串为尾串，求在当前串之前选取字符串并以当前串为结束得到的最大值，即dp[i]=max(dp[j])+w[i],s[j]是s[i]的子串且j<i；我们可以反向建立fail树，那么对于串s[i]的最后一位指向的孩子，均是包含s[i]的串，所以以s[i]的最后一位为根节点的子树中的孩子节点表示的串均包含s[i]，那么我们对串s[1]~s[n]逐一进行计算时，可以把结果用线段树更新到子树中，然后对于每个串计算时，只需要考虑s[i]的每一位能得到的最大值（线段树单点查询），最后取最大值加上w[i]，即为s[i]的最大值，然后更新到子树中。代码如下：
#include <cstdio>
#include <cstring>
#include <algorithm>
#include <iostream>
#include <queue>
using namespace std;
typedef long long LL;
const int N=2e4+5;
const int M=3e5+5;
char s[M];
int w[N],pos[N];

struct Node
{
int son[26];
}node[M];
int fail[M];
int root,tot;

struct Edge
{
int to;
int next;
}edge[M];

int in[M],out[M],tp;  ///树节点序号化;

int tx[M*4],tf[M*4],L,R,tmp; ///线段树;

inline int newnode()
{
tot++;
memset(node[tot].son,0,sizeof(node[tot].son));
fail[tot]=0;
}

{
edge[cnt].to=v;
}

inline void insert(char s[])
{
int now=root;
for(int i=0;s[i];i++)
{
if(!node[now].son[s[i]-'a'])
node[now].son[s[i]-'a']=newnode();
now=node[now].son[s[i]-'a'];
}
}
inline void build()
{
queue<int>Q;
Q.push(root);
while(!Q.empty())
{
int now=Q.front(); Q.pop();
for(int i=0;i<26;i++)
{
if(node[now].son[i])
{
if(now!=root)
fail[node[now].son[i]]=node[fail[now]].son[i];
Q.push(node[now].son[i]);
}
else node[now].son[i]=node[fail[now]].son[i];

}
}
}
inline void dfs(int now)
{
in[now]=++tp;
{
dfs(edge[i].to);
}
out[now]=tp;
}
inline void pushdown(int i)
{
if(!tf[i]) return ;
int pre=tf[i];
tf[i<<1]=max(tf[i<<1],pre);
tf[i<<1|1]=max(tf[i<<1|1],pre);
tx[i<<1]=max(tx[i<<1],pre);
tx[i<<1|1]=max(tx[i<<1|1],pre);
tf[i]=0;
}
inline int query(int l,int r,int i)
{
if(l==r) return tx[i];
int mid=(l+r)>>1;
pushdown(i);
if(L<=mid) return query(l,mid,i<<1);
else return query(mid+1,r,i<<1|1);
}
inline void update(int l,int r,int i)
{
if(L<=l&&r<=R)
{
tf[i]=max(tf[i],tmp);
tx[i]=max(tx[i],tmp);
return ;
}
int mid=(l+r)>>1;
pushdown(i);
if(L<=mid) update(l,mid,i<<1);
if(R>mid)  update(mid+1,r,i<<1|1);
tx[i]=max(tx[i<<1],tx[i<<1|1]);
}

void init()
{
tot=-1;
cnt=1;
tp=0;
root=newnode();
memset(fail,0,sizeof(fail));
memset(tx,0,sizeof(tx));
memset(tf,0,sizeof(tf));
}
int main()
{
int T,Case=1; scanf("%d",&T);
while(T--)
{
init();
int n; scanf("%d",&n);
for(int i=1;i<=n;++i)
{
scanf("%s%d",s+pos[i-1],w+i);
pos[i]=pos[i-1]+strlen(s+pos[i-1]);
insert(s+pos[i-1]);
}
build();
dfs(root);
int ans=0;
for(int i=1;i<=n;++i)
{
tmp=0;
int now=root;
for(int j=pos[i-1];j<pos[i];++j)
{
now=node[now].son[s[j]-'a'];
L=in[now]; R=in[now];
int res=query(1,tp,1);
tmp=max(tmp,res);
}
tmp+=w[i];
ans=max(ans,tmp);
L=in[now];
R=out[now];
update(1,tp,1);
}
printf("Case #%d: %d\n",Case++,ans);
}
return 0;
}

posted @ 2017-10-26 22:40  茶飘香~  阅读(...)  评论(...编辑  收藏