[Original] POJ 1035 Spell checker: edit distance, solution report

http://poj.org/problem?id=1035

 

Method 1: dynamic programming, O(len^2)

EditDist

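Let c[i,j] be the edit distance between the first i characters of s1 and the first j characters of s2. Row 0 and column 0 of the matrix c hold the base cases (c[i,0] = i and c[0,j] = j), so the edit distance between the full strings is c[len1,len2]. Below, s1[i] and s2[j] denote the i-th and j-th characters, counted from 1.
When s1[i]==s2[j], no operation is needed, so c[i,j] = c[i-1,j-1].
When s1[i]!=s2[j], first consider the possible changes to s1:
       Delete s1[i], then compare s1[1...i-1] with s2[1...j]: the distance is c[i-1,j].
       Replace s1[i] with s2[j], so that the last characters match, then compare s1[1...i-1] with s2[1...j-1]: the distance is c[i-1,j-1].
       Append s2[j] after s1[i], then compare s1[1...i] with s2[1...j-1]: the distance is c[i,j-1].
In this case c[i,j] is the minimum of c[i-1,j], c[i,j-1] and c[i-1,j-1], plus one for the operation just performed.
Applying the same three operations to s2[j] instead leads to the same three entries of c.
This gives the following recurrence: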

c[i,j] = c[i-1,j-1]                                  , if s1[i]==s2[j]
c[i,j] = min{ c[i,j-1] , c[i-1,j] , c[i-1,j-1] } + 1 , otherwise
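As an illustration of the recurrence, here is a minimal self-contained sketch. It is not the author's EditDist: the name editDistance and the use of std::vector for the table are assumptions made for this example, and the base cases c[i][0] = i and c[0][j] = j are filled in explicitly.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: edit distance between a and b,
// computed with the full (len1+1) x (len2+1) table.
int editDistance(const std::string& a, const std::string& b) {
    const std::size_t n = a.size();
    const std::size_t m = b.size();
    std::vector<std::vector<int> > c(n + 1, std::vector<int>(m + 1, 0));

    // Base cases: turning a prefix into the empty string costs its length.
    for (std::size_t i = 0; i <= n; ++i) c[i][0] = static_cast<int>(i);
    for (std::size_t j = 0; j <= m; ++j) c[0][j] = static_cast<int>(j);

    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            if (a[i - 1] == b[j - 1]) {
                // Last characters match: no operation needed.
                c[i][j] = c[i - 1][j - 1];
            } else {
                // Replace, delete, or insert: take the cheapest and add one.
                c[i][j] = 1 + std::min(c[i - 1][j - 1],
                                       std::min(c[i - 1][j], c[i][j - 1]));
            }
        }
    }
    return c[n][m];
}
```

For this problem a dictionary word is a valid replacement exactly when the distance to the checked word is 1, for example editDistance("mre", "more").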

 

Method 2: sequential scan, O(len)

run1035()

We only need to know whether the edit distance is exactly 1, so a linear-time scan is enough.
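The idea can be sketched as follows: skip the common prefix, then require that everything after the single differing position matches. This is an alternative formulation, not the author's check(); the name isOneEditApart is an assumption, and unlike check() it returns false for identical strings.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true when a and b differ by exactly one
// replacement, insertion, or deletion of a single character.
bool isOneEditApart(const std::string& a, const std::string& b) {
    const std::string& s = a.size() <= b.size() ? a : b;  // shorter string
    const std::string& l = a.size() <= b.size() ? b : a;  // longer string
    if (l.size() - s.size() > 1) return false;

    // Skip the common prefix.
    std::size_t i = 0;
    while (i < s.size() && s[i] == l[i]) ++i;

    if (s.size() == l.size()) {
        if (i == s.size()) return false;  // identical strings: distance 0
        // One replacement at position i: the remaining suffixes must be equal.
        return s.compare(i + 1, std::string::npos,
                         l, i + 1, std::string::npos) == 0;
    }
    // l has one extra character at position i: skip it and compare the rest.
    return s.compare(i, std::string::npos,
                     l, i + 1, std::string::npos) == 0;
}
```

Each call touches every character at most once, so checking one word against the whole dictionary costs time proportional to the total length of the dictionary words.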

 

Description

You, as a member of a development team for a new spell checking program, are to write a module that will check the correctness of given words using a known dictionary of all correct words in all their forms.
If the word is absent in the dictionary then it can be replaced by correct words (from the dictionary) that can be obtained by one of the following operations:
- deleting of one letter from the word;
- replacing of one letter in the word with an arbitrary letter;
- inserting of one arbitrary letter into the word.
Your task is to write the program that will find all possible replacements from the dictionary for every given word.

Input

The first part of the input file contains all words from the dictionary. Each word occupies its own line. This part is finished by the single character '#' on a separate line. All words are different. There will be at most 10000 words in the dictionary.
The next part of the file contains all words that are to be checked. Each word occupies its own line. This part is also finished by the single character '#' on a separate line. There will be at most 50 words that are to be checked.
All words in the input file (words from the dictionary and words to be checked) consist only of small alphabetic characters and each one contains 15 characters at most.

Output

Write to the output file exactly one line for every checked word in the order of their appearance in the second part of the input file. If the word is correct (i.e. it exists in the dictionary) write the message: "<checked word> is correct". If the word is not correct then write this word first, then write the character ':' (colon), and after a single space write all its possible replacements, separated by spaces. The replacements should be written in the order of their appearance in the dictionary (in the first part of the input file). If there are no replacements for this word then the line feed should immediately follow the colon.
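The output rules are easy to get slightly wrong, in particular the case with no replacements, where nothing may follow the colon. A minimal sketch of the line format, assuming a hypothetical helper formatResult that is not part of the original solution:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: builds one output line (without the line feed).
// inDictionary says whether the checked word was found in the dictionary;
// replacements must already be in dictionary order.
std::string formatResult(const std::string& word, bool inDictionary,
                         const std::vector<std::string>& replacements) {
    if (inDictionary) return word + " is correct";

    std::string line = word + ":";
    for (std::size_t k = 0; k < replacements.size(); ++k) {
        // A space goes before each replacement, so an empty list
        // leaves the colon as the last character of the line.
        line += ' ';
        line += replacements[k];
    }
    return line;
}
```

Putting the space before each replacement rather than after it avoids a trailing space at the end of the line.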

Sample Input

i
is
has
have
be
my
more
contest
me
too
if
award
#
me
aware
m
contest
hav
oo
or
i
fi
mre
#

Sample Output

me is correct
aware: award
m: i my me
contest is correct
hav: has have
oo: too
or:
i is correct
fi: i
mre: more me

 

 

#include <string>
using namespace std ;

// Fills c with the edit-distance table for s1 and s2 and returns c[len1][len2].
// c[i][j] is the distance between the first i chars of s1 and the first j chars of s2.
// Words are at most 15 characters long, hence the 16 x 16 table.
int EditDist(const string& s1,const string& s2,int c[15+1][15+1])
{
    int len1 = s1.size() ;
    int len2 = s2.size() ;
    int i,j ;
    int d1,d2,d3 ;
    int best ;

    // base cases: turning a prefix into the empty string costs its length
    for( i=0 ; i<len1+1 ; ++i )
        c[i][0] = i ;
    for( j=0 ; j<len2+1 ; ++j )
        c[0][j] = j ;

    for( i=1 ; i<len1+1 ; ++i )
    {
        for( j=1 ; j<len2+1 ; ++j )
        {
            // table indices are 1-based, string indices are 0-based
            if( s1[i-1] == s2[j-1] )
                c[i][j] = c[i-1][j-1] ;
            else
            {
                d1 = c[i-1][j-1] ;   // replace
                d2 = c[i-1][j] ;     // delete from s1
                d3 = c[i][j-1] ;     // insert into s1
                best = d1<d2 ? d1 : d2 ;
                best = best<d3 ? best : d3 ;
                c[i][j] = best+1 ;
            }
        }
    }
    return c[len1][len2] ;
}

 

// headers used by check() and run1035()
#include <iostream>
#include <set>
#include <string>
#include <vector>
using namespace std ;

// Returns true when s1 and s2 are at most one edit apart.
// Identical strings also return true; run1035() filters them out
// with the dictionary lookup before calling check().
bool check( const string& s1 , const string& s2 )
{
    int len1 = s1.size() ;
    int len2 = s2.size() ;
    int i ;
    if( len1 == len2 )  //replace
    {
        i = 0 ;
        while( i<len1 && s1[i]==s2[i] )
            ++i ;
        //now s1[i]!=s2[i] ; the edit distance is 1 only if the i-th char was replaced
        //and all other positions are the same , so skip this position
        //e.g. hello hollo
        while( ++i<len1 )
            if( s1[i]!=s2[i] )
                return false ;
    }
    else if( len1 == len2+1 )  //s1 has one extra char
    {
        i = 0 ;
        while( i<len2 && s1[i]==s2[i] )
            ++i ;
        //now s1[i]!=s2[i] ; the edit distance is 1 only if s1[i] was inserted
        //and all other positions are the same , so skip s1[i]
        //e.g. more mre
        while( ++i<len1 )
            if( s1[i]!=s2[i-1] )
                return false ;
    }
    else if( len1+1 == len2 )  //s2 has one extra char
    {
        i = 0 ;
        while( i<len1 && s1[i]==s2[i] )
            ++i ;
        //now s1[i]!=s2[i] ; the edit distance is 1 only if s2[i] was inserted
        //and all other positions are the same , so skip s2[i]
        //e.g. mre more
        while( ++i<len2 )
            if( s1[i-1]!=s2[i] )
                return false ;
    }
    else
        return false ;  //lengths differ by more than 1

    return true ;
}

//use check
void run1035()
{
    vector<string> dictVec ;    //dictionary words in input order
    dictVec.reserve(10000) ;
    vector<string>::iterator dictVecIter ;

    //std::set (from <set>) stands in for the non-standard , MSVC-only stdext::hash_set
    set<string> dictSet ;
    string str ;

    //first part : the dictionary , terminated by "#"
    while( cin>>str && str!="#" )
    {
        dictVec.push_back(str) ;
        dictSet.insert(str) ;
    }

    //second part : the words to check , terminated by "#"
    while( cin>>str && str!="#" )
    {
        if( dictSet.find(str)!=dictSet.end() )
        {
            cout<<str<<" is correct"<<endl ;
            continue ;
        }

        //print ":" with no trailing space , then " word" for each replacement ,
        //so a word without replacements ends right after the colon
        cout<<str<<":" ;
        for( dictVecIter=dictVec.begin() ; dictVecIter!=dictVec.end() ; ++dictVecIter )
        {
            if( check(*dictVecIter,str) )
                cout<<" "<<*dictVecIter ;
        }
        cout<<endl ;
    }

    dictVec.clear() ;
    dictSet.clear() ;
}
posted @ 2010-11-04 21:12  Allen Sun