Algorithms (Sedgewick, 4th Edition) - Chapter 1 Fundamentals - 023 - MultiwordSearch.java

 

Multi-word search. Program MultiwordSearch.java reads a sequence of query words q[1], ..., q[k] from the command line and a sequence of document words d[1], ..., d[N] from standard input and finds the shortest interval in which the k words appear in the same order. (Here the length of an interval is the number of words it contains.) That is, find indices i1 < i2 < ... < ik such that d[i1] = q[1], d[i2] = q[2], ..., d[ik] = q[k] and ik - i1 is as small as possible.
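To make the problem statement concrete, here is a minimal brute-force sketch. It is not the program from the book: the class name ShortestOrderedInterval and the method bruteForce are hypothetical, and indices are 0-based rather than 1-based as in the statement above. For each occurrence of the first query word it matches the remaining query words as early as possible, which gives the shortest interval starting at that position.

```java
import java.util.Arrays;

// Hypothetical helper class for illustration; not part of MultiwordSearch.java.
class ShortestOrderedInterval {

    // Returns {lo, hi} (0-based, inclusive) of the shortest interval of doc that
    // contains the query words in order, or null if no such interval exists.
    static int[] bruteForce(String[] doc, String[] query) {
        if (query.length == 0) return null;
        int[] best = null;
        for (int lo = 0; lo < doc.length; lo++) {
            if (!doc[lo].equals(query[0])) continue;
            // Match the remaining query words greedily, as early as possible;
            // for a fixed start this yields the smallest possible end.
            int hi = lo;
            int j = 1;
            for (int i = lo + 1; i < doc.length && j < query.length; i++) {
                if (doc[i].equals(query[j])) {
                    hi = i;
                    j++;
                }
            }
            if (j < query.length) continue;  // some query word was never matched
            if (best == null || hi - lo < best[1] - best[0]) {
                best = new int[] { lo, hi };
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String[] doc = "the quick brown fox the lazy fox jumps".split(" ");
        int[] r = bruteForce(doc, new String[] { "the", "fox" });
        // prints "the lazy fox"
        System.out.println(String.join(" ", Arrays.copyOfRange(doc, r[0], r[1] + 1)));
    }
}
```

This takes time proportional to N for each occurrence of the first query word, so it is mainly useful as a reference for checking a faster solution.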

Answer: for each query word, create a sorted list of the indices where it appears in the document. Scan through lists 2 to k in that order, deleting indices at the front of each list until the first elements of the k lists are in ascending order.

The first elements of the k lists now form the shortest interval that starts at the first element of list 1.

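Now delete the first element of list 1. Delete elements from the front of list 2 until its first element is larger than the first element of list 1. Repeat for list 3, and so on, until the first elements of all k lists are again in ascending order. Compare the interval formed by these first elements with the shortest one found so far, and continue in this way until some list becomes empty.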
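The steps above can be sketched with only java.util classes, which makes the idea easy to try without the book's Queue, StdIn and StdOut. This is an illustrative variant under stated assumptions, not the program from the book: the class name IndexListSearch and the method shortestInterval are hypothetical, ArrayDeque stands in for the book's Queue, and indices are 0-based.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical class for illustration; ArrayDeque replaces the book's Queue.
class IndexListSearch {

    // Returns {lo, hi} (0-based, inclusive) of the shortest interval of doc that
    // contains the query words in order, or null if no such interval exists.
    static int[] shortestInterval(String[] doc, String[] query) {
        int k = query.length;
        if (k == 0) return null;

        // lists.get(j) = ascending positions of the jth query word in doc
        List<Deque<Integer>> lists = new ArrayList<>();
        for (int j = 0; j < k; j++) lists.add(new ArrayDeque<>());
        for (int i = 0; i < doc.length; i++) {
            for (int j = 0; j < k; j++) {
                if (doc[i].equals(query[j])) lists.get(j).addLast(i);
            }
        }

        int[] best = null;
        Deque<Integer> first = lists.get(0);
        while (!first.isEmpty()) {
            int lo = first.removeFirst();
            int hi = lo;
            for (int j = 1; j < k; j++) {
                Deque<Integer> list = lists.get(j);
                // Discard positions that cannot follow the previous word.
                while (!list.isEmpty() && list.peekFirst() <= hi) list.removeFirst();
                // Once a list runs out, no later start position can succeed.
                if (list.isEmpty()) return best;
                hi = list.peekFirst();
            }
            if (best == null || hi - lo < best[1] - best[0]) {
                best = new int[] { lo, hi };
            }
        }
        return best;
    }
}
```

Discarded positions never need to be revisited because the start position only moves to the right, so each index is added and removed at most once after the lists are built.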

/******************************************************************************
 *  Compilation:  javac MultiwordSearch.java
 *  Execution:    java MultiwordSearch query1 query2 ... < input.txt
 *  Dependencies: Queue.java StdIn.java StdOut.java
 *
 *  Find the shortest interval (number of words) in the input file
 *  that contains the query words in the order specified on the command line.
 *
 ******************************************************************************/

public class MultiwordSearch {
    @SuppressWarnings("unchecked")  // generic array creation needs an unchecked cast
    public static void main(String[] args) {
        String[] words = StdIn.readAllStrings();

        // construct queues[j] = sequence of positions of jth query word
        Queue<Integer>[] queues = (Queue<Integer>[]) new Queue[args.length];
        for (int j = 0; j < args.length; j++) {
            queues[j] = new Queue<Integer>();
        }
        for (int i = 0; i < words.length; i++) {
            for (int j = 0; j < args.length; j++) {
                if (words[i].equals(args[j])) queues[j].enqueue(i);
            }
        }

        // repeatedly find smallest interval starting at position of queues[0];
        // stop as soon as some queue is exhausted, since no later start can succeed
        boolean done = false;
        int bestlo = -1, besthi = words.length;
        while (!done && !queues[0].isEmpty()) {
            int lo = queues[0].dequeue();
            int hi = lo;
            for (int j = 1; j < args.length; j++) {
                // discard positions of the jth word that do not come after hi
                while (!queues[j].isEmpty() && queues[j].peek() <= hi) {
                    queues[j].dequeue();
                }
                if (queues[j].isEmpty())  {
                    done = true;
                    break;
                }
                else hi = queues[j].peek();
            }
            if (!done && hi - lo < besthi - bestlo) {
                besthi = hi;
                bestlo = lo;
            }
        }

        // print the words of the best interval, if one was found
        if (bestlo >= 0) {
            for (int i = bestlo; i <= besthi; i++)
                StdOut.print(words[i] + " ");
            StdOut.println();
        }
        else
            StdOut.println("NOT FOUND");
    }
}

 

posted @ 2016-04-20 11:47  shamgod