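A brief introduction to using Whoosh

Whoosh is a full-text search engine written in pure Python. This post records some simple usage; see the official documentation for details.

Here is my code for creating the search documents (that is, the index documents), run on Windows.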

#coding=utf-8
import os
from whoosh.index import create_in, open_dir
from whoosh import fields

# Directory that holds the index; a raw string keeps the backslash literal.
WHOOSH_ADD = r'E:\whoosh_index'
# Both fields are indexed as text and stored so they can be returned with hits.
WHOOSH_SCHEMA = fields.Schema(title=fields.TEXT(stored=True),
    content=fields.TEXT(stored=True),
    )
# Create the index only on the first run, then open it.
if not os.path.exists(WHOOSH_ADD):
    os.mkdir(WHOOSH_ADD)
    ix = create_in(WHOOSH_ADD, schema=WHOOSH_SCHEMA, indexname='comment')
ix = open_dir(WHOOSH_ADD, indexname='comment')
writer = ix.writer()
writer.add_document(title=u'chang yanjie add', content=u' zheng wen 我是正文',)
writer.add_document(title=u'chang yan1 jie2 add', content=u' zheng wen 我是正文2',)
# Nothing is written to the index until commit() is called.
writer.commit()

If you are following along, change the WHOOSH_ADD path to suit your own machine.

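Of course there is also an update method. Note that update_document only replaces an existing document when the schema has a field marked unique=True that it can match on; with the schema above, which has no unique field, the call simply adds another document. Remember to call writer.commit() afterwards.

writer.update_document(title=u"chang yanjie add", content=u"变啦",)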

Search code:

#coding=utf-8
from whoosh import index
from whoosh.qparser import QueryParser

ix = index.open_dir(r'E:\whoosh_index', indexname='comment')
hits = []
query = u' zheng'
# Parse the query string against the "content" field.
parser = QueryParser("content", schema=ix.schema)
try:
    word = parser.parse(query)
except Exception:
    word = None
if word is not None:
    s = ix.searcher()
    hits = s.search(word)
    # Note: with "with ix.searcher() as s:", the searcher is closed
    # automatically when the block exits, and hits can no longer be
    # used afterwards.
    #with ix.searcher() as s:
    #    hits = s.search(word)
print(len(hits))

If everything works, the result should be 2, haha.

posted @ 2013-01-10 18:03  深秋的黎明