*Creating an Index: First Steps

A full-text search tool is made up of three parts:

1. the indexing part,

2. the tokenization (analysis) part,

3. the search part.
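
To make the three parts concrete, here is a toy sketch in plain Java (JDK only). It is not how Lucene is implemented; the class name `TinyInvertedIndex` and its methods are made up for illustration. `tokenize` stands in for the tokenization part, `add` for the indexing part, and `search` for the search part.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy illustration only; hypothetical class, NOT Lucene's implementation.
class TinyInvertedIndex {
    // term -> sorted set of document names that contain the term
    private final Map<String, Set<String>> postings = new HashMap<String, Set<String>>();

    // Tokenization part: lowercase the text and split on anything
    // that is not a letter or a digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Indexing part: record, for every term, which document it came from.
    void add(String docName, String text) {
        for (String term : tokenize(text)) {
            Set<String> docs = postings.get(term);
            if (docs == null) {
                docs = new TreeSet<String>();
                postings.put(term, docs);
            }
            docs.add(docName);
        }
    }

    // Search part: look up a single word and return the matching document names.
    Set<String> search(String word) {
        Set<String> docs = postings.get(word.toLowerCase(Locale.ROOT));
        if (docs == null) {
            return Collections.<String>emptySet();
        }
        return Collections.unmodifiableSet(docs);
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.add("a.txt", "Hello Lucene");
        index.add("b.txt", "Hello full-text search");
        System.out.println(index.search("hello"));  // [a.txt, b.txt]
        System.out.println(index.search("lucene")); // [a.txt]
    }
}
```

A real tool does the same three jobs, but with far more care about storage, scoring and language-specific tokenization.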

 

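【Creating an Index】

Steps:

1. Create a Directory (where should the index live: in memory or on disk?)

2. Create an IndexWriter (the index is written through the IndexWriter)

3. Create a Document object.

4. Add Fields to the Document
   Note: a Field is a child element of a Document.


5. Add the Document to the index through the IndexWriter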

 

【Practice】

Index creation code:

HelloLucene.java:

package com.hk.test;

import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.CorruptIndexException;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.LockObtainFailedException;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class HelloLucene {
    /*
     * Build the index.
     */
    public void index() {
        IndexWriter writer = null;

        try {
            // 1. Create the Directory (where should the index live: in memory or on disk?)
            //Directory directory = new RAMDirectory(); // in memory
            Directory directory = FSDirectory.open(new File("D:/lucene/index01")); // on disk

            // 2. Create the IndexWriter (the index is written through the IndexWriter)
            IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_35, new StandardAnalyzer(Version.LUCENE_35));

            writer = new IndexWriter(directory, iwc);

            // 3. Create the Document object.
            Document doc = null;
            // 4. Add Fields to the Document
            File f = new File("D:/lucene");
            File[] files = f.listFiles(); // null if D:/lucene does not exist
            if (files == null) {
                files = new File[0];
            }
            for (File file : files) {
                // Skip subdirectories such as index01: FileReader can only open regular files
                if (!file.isFile()) {
                    continue;
                }
                doc = new Document();
                doc.add(new Field("content", new FileReader(file)));
                // Field.Store.YES: store the full file name in the index
                // Field.Index.NOT_ANALYZED: do not tokenize the file name; it is already a single term
                doc.add(new Field("fileName", file.getName(), Field.Store.YES, Field.Index.NOT_ANALYZED));
                doc.add(new Field("path", file.getAbsolutePath(), Field.Store.YES, Field.Index.NOT_ANALYZED));
                // 5. Add the document to the index through the IndexWriter
                writer.addDocument(doc);
            }
        } catch (CorruptIndexException e) {
            e.printStackTrace();
        } catch (LockObtainFailedException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (writer != null) {
                try {
                    writer.close();
                } catch (CorruptIndexException e) {
                    e.printStackTrace();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}
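
One thing to watch in the loop over D:/lucene: the index directory index01 sits inside that same folder, so `listFiles()` returns it along with the text files, and `FileReader` cannot open a directory. `listFiles()` also returns null when the path is not a directory. Below is a small sketch of a helper that returns only regular files. `IndexableFiles` is a made-up name, not part of Lucene or of the HelloLucene class, and it uses only the JDK.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; not part of Lucene.
class IndexableFiles {
    // Returns the regular files directly inside dir, sorted by path.
    // Subdirectories are skipped; a missing or non-directory path gives an empty list.
    static List<File> list(File dir) {
        List<File> result = new ArrayList<File>();
        File[] children = dir.listFiles(); // null if dir is not a readable directory
        if (children == null) {
            return result;
        }
        for (File child : children) {
            if (child.isFile()) {
                result.add(child);
            }
        }
        Collections.sort(result); // File implements Comparable<File>
        return result;
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        for (File f : list(dir)) {
            System.out.println(f.getName());
        }
    }
}
```

The indexing loop could then iterate over `IndexableFiles.list(new File("D:/lucene"))` instead of calling `f.listFiles()` directly.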

 

Testing the index

TestLucene.java:

package com.hk.test;

import org.junit.Test;

public class TestLucene {
    @Test
    public void testIndex() {
        HelloLucene hello = new HelloLucene();
        hello.index();
    }

}
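
The test above only calls `index()` and asserts nothing, so it passes even when no index was written. A rough sanity check, sketched below with the JDK only, is to look inside the index directory afterwards: when an IndexWriter commits, Lucene writes a commit file whose name starts with `segments`. `IndexCheck` and `looksLikeLuceneIndex` are made-up names, and the check is a heuristic that does not validate the index contents.

```java
import java.io.File;

// Hypothetical helper for illustration; a heuristic, not a Lucene API.
class IndexCheck {
    // Returns true if indexDir is a directory containing at least one
    // file whose name starts with "segments" (Lucene's commit files).
    static boolean looksLikeLuceneIndex(File indexDir) {
        String[] names = indexDir.list(); // null if indexDir is not a readable directory
        if (names == null) {
            return false;
        }
        for (String name : names) {
            if (name.startsWith("segments")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Default path taken from the HelloLucene example.
        File dir = new File(args.length > 0 ? args[0] : "D:/lucene/index01");
        System.out.println(dir + " looks like an index: " + looksLikeLuceneIndex(dir));
    }
}
```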

 

Run result:

 

posted @ 2018-10-25 14:50  猩生柯北