New Findings

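We found that Lucene can be used in place of Elasticsearch to implement the search feature. We do not understand it well yet, so we will keep studying it. 泽雷 and 东帅 pulled search-related code from GitHub and Gitee respectively, one sample based on Solr and the other on Lucene, and the Lucene one looked easier to implement.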

We looked up some background on Lucene online:

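Lucene is an open-source text search and information retrieval library. It provides APIs and tools for indexing, searching, and querying text data. It is written in Java and is developed and maintained by the Apache Software Foundation. Lucene is widely used in applications such as search engines, e-commerce, content management systems, and natural language processing.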
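Lucene's core features cover text indexing, querying, and filtering, which let users search large volumes of text quickly. The core idea is to convert text into an index ahead of time, so that a search looks up matching documents in the index instead of scanning every document. Lucene itself indexes plain text. Content in formats such as HTML, XML, or PDF has to be converted to text first, typically by a separate parser such as Apache Tika.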
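To make the "convert text into an index" idea concrete, here is a minimal sketch of an inverted index in plain Java. It is only an illustration of the principle and is not how Lucene is implemented internally. The class name TinyInvertedIndex and its methods are made up for this example, and the tokenizer is deliberately naive.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index: maps each lowercased word to the ids of the documents that contain it.
// Hypothetical illustration only; Lucene's real index format is far more elaborate.
public class TinyInvertedIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    private final List<String> docs = new ArrayList<>();

    // Adds a document and returns its id (ids start at 0).
    public int add(String text) {
        int docId = docs.size();
        docs.add(text);
        // Split on runs of characters that are neither letters nor digits.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
            }
        }
        return docId;
    }

    // Returns the sorted ids of documents containing the word, without scanning any document text.
    public Set<Integer> search(String word) {
        return postings.getOrDefault(word.toLowerCase(Locale.ROOT), new TreeSet<>());
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.add("Lucene is a search library");
        index.add("Elasticsearch is built on Lucene");
        index.add("Solr is a search server");
        System.out.println(index.search("lucene")); // prints [0, 1]
        System.out.println(index.search("search")); // prints [0, 2]
    }
}
```

A lookup is a single map access, which is why searching stays fast as the number of documents grows. The cost is paid once, when each document is added.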
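Beyond basic indexing and querying, Lucene offers more advanced features such as multi-field search, range queries, and fuzzy search. Solr and Elasticsearch are search servers built on top of Lucene. They add things like distributed search and HTTP APIs on top of the library.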
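Fuzzy search means matching terms that are within a small number of single-character edits of the query term. As far as we understand, Lucene's FuzzyQuery allows at most 2 edits and uses an automaton internally rather than comparing terms one by one. The sketch below only shows the underlying notion of edit distance in plain Java. The class EditDistance and the helper fuzzyMatch are hypothetical names for this example and are not part of Lucene.

```java
// Plain Levenshtein edit distance, shown only to illustrate what "fuzzy" matching measures.
// This is not Lucene's implementation.
public class EditDistance {
    // Dynamic programming over two rolling rows: prev is row i-1, curr is row i.
    public static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of b takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i chars of a into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Hypothetical helper: true if candidate is within maxEdits edits of term.
    public static boolean fuzzyMatch(String term, String candidate, int maxEdits) {
        return levenshtein(term, candidate) <= maxEdits;
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("lucene", "lucine")); // prints 1
        System.out.println(fuzzyMatch("lucene", "lucen", 1)); // prints true
    }
}
```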
In short, Lucene is a powerful text search and information retrieval technology that offers a high-performance, flexible, and extensible search solution.

This is what we have learned so far.

Creating the index:

import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class Indexer {
    private IndexWriter writer;

    // Opens (or creates) an on-disk index in indexDir, analyzed with StandardAnalyzer.
    public Indexer(String indexDir) throws IOException {
        Directory dir = FSDirectory.open(new File(indexDir).toPath());
        IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
        writer = new IndexWriter(dir, config);
    }

    // Commits pending changes and releases the index.
    public void close() throws IOException {
        writer.close();
    }

    // Adds one file to the index as a single Document.
    public void indexFile(File file) throws IOException {
        Document doc = new Document();
        // File contents are tokenized and indexed but not stored.
        // FileReader reads with the platform default charset.
        Field content = new Field("content", new FileReader(file), TextField.TYPE_NOT_STORED);
        // Name and path are indexed as single untokenized terms and stored for display in results.
        Field fileName = new StringField("fileName", file.getName(), Field.Store.YES);
        Field filePath = new StringField("filePath", file.getCanonicalPath(), Field.Store.YES);

        doc.add(content);
        doc.add(fileName);
        doc.add(filePath);

        writer.addDocument(doc);
    }
}

Searching text:

import java.io.IOException;
import java.nio.file.Paths;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;

public class Searcher {
    // Runs queryStr against the "content" field and prints the top 10 hits.
    public void search(String indexDir, String queryStr) throws IOException, ParseException {
        // try-with-resources closes the reader even if parsing or searching throws.
        try (IndexReader reader = DirectoryReader.open(FSDirectory.open(Paths.get(indexDir)))) {
            IndexSearcher searcher = new IndexSearcher(reader);
            // Use the same analyzer as at index time so query terms match indexed terms.
            QueryParser parser = new QueryParser("content", new StandardAnalyzer());
            Query query = parser.parse(queryStr);
            TopDocs docs = searcher.search(query, 10);
            ScoreDoc[] hits = docs.scoreDocs;
            for (int i = 0; i < hits.length; i++) {
                int docId = hits[i].doc;
                // Loads the stored fields (fileName, filePath) of the hit.
                Document d = searcher.doc(docId);
                System.out.println((i + 1) + ". " + d.get("fileName") + " (" + d.get("filePath") + ")");
            }
        }
    }
}

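The code above shows how to create an index and search text with Lucene. The indexing code turns each text file into a Document object and adds it to the index. The search code uses QueryParser to parse the query string into a Query object and runs it through IndexSearcher. Finally, it prints the name and path of each matching document to the console.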

posted @ 2023-05-14 23:50 阖家旺
