Replacing Hibernate Search with Hibernate + Solr

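This post describes an attempt to replace Hibernate Search with Solr. I do not have a thorough understanding of either project and have only used both at the API level, so treat this post as a reference only.

Hibernate 4.1, Solr 3.6.0

What I have implemented so far:

1. Solr that works out of the box, just like Hibernate Search: once the JAR is on the classpath it registers its event listeners automatically, and it updates schema.xml as soon as the sessionFactory has finished initializing.

2. When Hibernate performs an insert, update, or delete, the data is submitted to Solr for the corresponding operation. All of this is event-driven.

3. A query interface.

To do: more query implementations.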

A brief introduction to both

The following is quoted from each project's official website.

Solr

Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites.

From:http://lucene.apache.org/solr/

Hibernate Search

Hibernate Search brings the power of full text search engines to the persistence domain model by combining Hibernate Core with the capabilities of the Apache Lucene™ search engine.

From:http://hibernate.org/subprojects/search.html

 

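Motivation

Both are built on the Apache Lucene project.

Advantages of Hibernate Search:

1. Hibernate Search really is simple: add a few annotations, set a few configuration options, and then use the full-text search query API. You do not have to worry about index management.

2. It integrates seamlessly with Hibernate, and its API resembles Hibernate's, so it is easy to pick up.

Shortcomings of Hibernate Search:

1. It can only index and query data within a single entity and cannot run a global query across multiple tables. You can see this from the index files it generates: each entity's index is stored in its own folder.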


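2. Support for complex queries such as grouping is insufficient, there is no highlighting query, and the features highlighted in the Solr introduction above all appear to be missing.

3. Solr adds a lot of wrapping and management functionality on top of Lucene, which makes it easy to extend and administer, for example by separating reads from writes.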

 

Goal

Integrate Hibernate with Solr so that using it is as simple as using Hibernate Search.

Versions used in this experiment:

<org.hibernate.version>4.1.1.Final</org.hibernate.version>
<solr.version>3.6.0</solr.version>

 

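Steps

1. Automatically add the Solr field definitions for each entity (at system startup, while the mappings are bound, process the fields that carry Hibernate Search annotations such as @Indexed and @Field). This is needed because Solr requires every field to be defined in schema.xml before documents are indexed; otherwise it reports an error. Entity parsing reuses the existing Hibernate Search code directly, which saves a lot of work.

2. Handle index changes (by handling Hibernate events), including inserts, updates, and deletes.

3. Index queries (provide a query API as a thin wrapper around SolrJ).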
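Step 1 can be sketched with plain reflection. This is only an illustration under stated assumptions: the @Indexed and @Field annotations below are local stand-ins declared inside the example (not Hibernate Search's real annotations), and SchemaScanner and Book are hypothetical names. The sketch collects the names of the annotated fields, which are the ones that would need a definition in schema.xml.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Set;
import java.util.TreeSet;

public class SchemaScanner {

    // Stand-in for Hibernate Search's @Indexed (assumption, declared locally).
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    public @interface Indexed {}

    // Stand-in for Hibernate Search's @Field (assumption, declared locally).
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    public @interface Field {}

    // Hypothetical entity used only for this demonstration.
    @Indexed
    public static class Book {
        @Field private String title;
        @Field private String author;
        private long internalVersion; // not annotated, so it is skipped
    }

    /** Returns the sorted Solr field names an entity needs, or an empty set if it is not @Indexed. */
    public static Set<String> solrFieldNames(Class<?> entity) {
        Set<String> names = new TreeSet<>();
        if (!entity.isAnnotationPresent(Indexed.class)) {
            return names;
        }
        for (java.lang.reflect.Field f : entity.getDeclaredFields()) {
            if (f.isAnnotationPresent(Field.class)) {
                names.add(f.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(solrFieldNames(Book.class));
    }
}
```

In the real integration the entity list would come from Hibernate's Configuration rather than from a hard-coded class, and the annotation metadata from Hibernate Search itself.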

 

Detailed steps:

1. From entities to Solr field definitions

This follows the way Hibernate Search does it:

Simply implement the org.hibernate.integrator.spi.Integrator interface.

For example, in Hibernate Search:

public class HibernateSearchIntegrator implements Integrator {

    private static final Log log = LoggerFactory.make();
    public static final String AUTO_REGISTER = "hibernate.search.autoregister_listeners";

    private FullTextIndexEventListener listener;

    @Override
    public void integrate(
            Configuration configuration,
            SessionFactoryImplementor sessionFactory,
            SessionFactoryServiceRegistry serviceRegistry) {
       

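Hibernate uses Java's ServiceLoader to load implementations of the Integrator interface automatically. The next two snippets show how this works; note that the implementation class must be declared in a provider file under META-INF/services: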

org.hibernate.integrator.internal.IntegratorServiceImpl.IntegratorServiceImpl(LinkedHashSet<Integrator>, ClassLoaderService):

for ( Integrator integrator : classLoaderService.loadJavaServices( Integrator.class ) )

 

org.hibernate.service.classloading.internal.ClassLoaderServiceImpl.loadJavaServices(Class<S>):

final ServiceLoader<S> loader = ServiceLoader.load( serviceContract, serviceLoaderClassLoader );

 

org.hibernate.internal.SessionFactoryImpl:

for ( Integrator integrator : serviceRegistry.getService( IntegratorService.class ).getIntegrators() ) {
        integrator.integrate( cfg, this, this.serviceRegistry );
        integratorObserver.integrators.add( integrator );
    }
In the end the integrators are loaded into the SessionFactory.

Once loaded, the following method is invoked automatically:

org.hibernate.integrator.spi.Integrator.integrate(Configuration, SessionFactoryImplementor, SessionFactoryServiceRegistry)

In org.hibernate.internal.SessionFactoryImpl.SessionFactoryImpl(Configuration, Mapping, ServiceRegistry, Settings, SessionFactoryObserver):

for ( Integrator integrator : serviceRegistry.getService( IntegratorService.class ).getIntegrators() ) {
            integrator.integrate( cfg, this, this.serviceRegistry );
            integratorObserver.integrators.add( integrator );
        }
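The discovery mechanism described above can be tried in isolation. The following is a minimal sketch, not Hibernate's own code: the nested Integrator interface and SolrIntegrator class are hypothetical stand-ins, and the provider file is written to a temporary directory only so that the example is self-contained. In a real project the file would be packaged in the JAR as META-INF/services/org.hibernate.integrator.spi.Integrator and would contain the fully qualified name of your implementation class.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class ServiceLoaderDemo {

    // Hypothetical stand-in for org.hibernate.integrator.spi.Integrator.
    public interface Integrator {
        String name();
    }

    // Hypothetical implementation; a provider needs a public no-arg constructor.
    public static class SolrIntegrator implements Integrator {
        @Override
        public String name() {
            return "solr";
        }
    }

    /** Writes a provider file into a temp directory and lets ServiceLoader discover it. */
    public static List<String> discover() {
        try {
            Path root = Files.createTempDirectory("spi-demo");
            Path services = Files.createDirectories(root.resolve("META-INF").resolve("services"));
            // File name = name of the service interface; content = name of the implementation.
            Files.write(services.resolve(Integrator.class.getName()),
                    SolrIntegrator.class.getName().getBytes(StandardCharsets.UTF_8));

            List<String> names = new ArrayList<>();
            try (URLClassLoader loader = new URLClassLoader(
                    new URL[] { root.toUri().toURL() },
                    ServiceLoaderDemo.class.getClassLoader())) {
                for (Integrator integrator : ServiceLoader.load(Integrator.class, loader)) {
                    names.add(integrator.name());
                }
            }
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(discover());
    }
}
```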

In your own implementation, submit the entity field definitions to Solr:

@Override
    public void integrate(Configuration configuration, SessionFactoryImplementor sessionFactory,
            SessionFactoryServiceRegistry serviceRegistry) {
        // TODO Auto-generated method stub
        System.out.println(configuration);
        Iterator<PersistentClass> classMappings = configuration.getClassMappings();
        while(classMappings.hasNext()) {
            PersistentClass next = classMappings.next(); // the next mapped entity

              // submit to Solr: localhost:8080/solr/updateschema

The field definition is converted to JSON and submitted:

SolrServer server = new CommonsHttpSolrServer("http://localhost:8082/solr-server");
        UpdateRequest req = new UpdateRequest("/updateschema");
        SolrInputDocument doc1 = new SolrInputDocument();
        doc1.addField("name_sex_girl", "{\"type\":\"maxWord\",\"indexed\":true,\"stored\":false,\"termVectors\":false,\"termPositions\":false,\"termOffsets\":false}");

        Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
        docs.add(doc1);
        req.add(docs);
        UpdateResponse rsp = req.process(server);
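The hand-escaped JSON string in the snippet above is easy to get wrong. As a minimal sketch, the same payload can be built from typed values. FieldDefJson is a hypothetical helper, not part of the original code; the attribute names are the ones that appear in the string above, and the three term vector options are simply switched off here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FieldDefJson {

    /** Builds the JSON attribute object for one schema field, in a fixed key order. */
    public static String toJson(String type, boolean indexed, boolean stored) {
        Map<String, Object> attrs = new LinkedHashMap<>();
        attrs.put("type", type);
        attrs.put("indexed", indexed);
        attrs.put("stored", stored);
        // Term vector options are hard-coded to false in this sketch.
        attrs.put("termVectors", false);
        attrs.put("termPositions", false);
        attrs.put("termOffsets", false);

        StringBuilder json = new StringBuilder("{");
        for (Map.Entry<String, Object> e : attrs.entrySet()) {
            if (json.length() > 1) {
                json.append(',');
            }
            json.append('"').append(e.getKey()).append("\":");
            Object value = e.getValue();
            if (value instanceof String) {
                json.append('"').append(escape((String) value)).append('"');
            } else {
                json.append(value);
            }
        }
        return json.append('}').toString();
    }

    // Escapes only backslashes and double quotes, which is enough for simple type names.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        System.out.println(toJson("maxWord", true, false));
    }
}
```

A JSON library would be the safer choice in production code; the manual builder is used here only to keep the example free of dependencies.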

Of course, localhost:8080/solr/updateschema is an endpoint we define ourselves; it exists solely to handle schema updates.

Register your own handler in solrconfig.xml:

<requestHandler name="/updateschema" class="com.xia.search.solr.hanlder.SchemaUpdateHanlder">
  </requestHandler>

SchemaUpdateHanlder implementation

public class SchemaUpdateHanlder extends SchemaStreamHandlerBase{

      @Override
      protected ContentStreamLoader newLoader(SolrQueryRequest req, UpdateRequestProcessor processor) {
        return new SchemaXMLLoader(processor, inputFactory);
      }

Related implementation

public class SchemaUpdateProcessorFactory extends UpdateRequestProcessorFactory {
    @Override
    public UpdateRequestProcessor getInstance(SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor next) {
        return new SchemaUpdateProcessor(req, next);
    }
}

 

class SchemaUpdateProcessor extends UpdateRequestProcessor {
    private final SolrQueryRequest req;
    private final UpdateHandler updateHandler;

    public SchemaUpdateProcessor(SolrQueryRequest req, UpdateRequestProcessor next) {
        super(next);
        this.req = req;
        this.updateHandler = new SchemaUpdateHandler(req.getCore());
    }

    @Override
    public void processAdd(AddUpdateCommand cmd) throws IOException {
        // FIXME modify schema here
        SolrInputDocument docs = cmd.getSolrInputDocument();
        for (SolrInputField field : docs) {
            String name = field.getName();
            SchemaField field2 = req.getSchema().getFieldOrNull(name);
            if(null==field2) {
                Field f=JasonUtil.toObjectFromJson(field.getValue().toString(), Field.class);
                  SolrConfig.getSchemaConfig().addField(f);
            }

Check schema.xml, and sure enough the field is there:

<field name="xxx" type="maxWord" indexed="true" stored="false" termVectors="false" termPositions="false" termOffsets="false"/>

Once all fields have been loaded, issue Solr's RELOAD command so that the configuration takes effect.

http://localhost:8081/solr/admin/cores?action=RELOAD&core=collection1

I trigger it from code:

    if (needReload) {
        // reload every core once the schema update has finished
        CoreContainer coreContainer = req.getCore().getCoreDescriptor().getCoreContainer();
        for (String coreName : coreContainer.getCoreNames()) {
            try {
                coreContainer.reload(coreName);
            } catch (ParserConfigurationException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            } catch (SAXException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
        }
    }
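The overall flow (add only the fields that are not yet in the schema, then reload once if anything changed) can be illustrated without Solr at all. The sketch below is an in-memory stand-in: SchemaDiff is a hypothetical class, and the map it keeps plays the role of schema.xml.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SchemaDiff {

    // Known field definitions, keyed by field name (stands in for schema.xml).
    private final Map<String, String> fields = new LinkedHashMap<>();

    /** Adds only fields that are not yet known; returns true if a reload is needed. */
    public boolean addMissing(Map<String, String> incoming) {
        boolean needReload = false;
        for (Map.Entry<String, String> e : incoming.entrySet()) {
            if (!fields.containsKey(e.getKey())) {
                fields.put(e.getKey(), e.getValue());
                needReload = true;
            }
        }
        return needReload;
    }

    public int size() {
        return fields.size();
    }
}
```

Submitting the same definitions a second time returns false, so an application restart with an unchanged model would not trigger a core reload.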

 

 

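2. Handling index changes

Hibernate 4 changed its event handling mechanism, so the approach that used to work in the configuration file no longer applies: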

<property name="eventListeners"> <map> <entry key="post-update"> <ref bean="myListener" /> </entry> </map> </property>

The only approach I have found so far is to register the listeners in code:

listener = new FullTextIndexEventListener(  );

        EventListenerRegistry listenerRegistry = ((SessionFactoryImpl) sessionFactory).getServiceRegistry().getService(
                EventListenerRegistry.class);
        //TODO if the event is duplicated, do not initialize the newly created listener
        listenerRegistry.addDuplicationStrategy( new DuplicationStrategyImpl( FullTextIndexEventListener.class ) );
        listenerRegistry.getEventListenerGroup( EventType.POST_INSERT ).appendListener( listener );
        listenerRegistry.getEventListenerGroup( EventType.POST_UPDATE ).appendListener( listener );
        listenerRegistry.getEventListenerGroup( EventType.POST_DELETE ).appendListener( listener );
        listenerRegistry.getEventListenerGroup( EventType.POST_COLLECTION_RECREATE ).appendListener( listener );
        listenerRegistry.getEventListenerGroup( EventType.POST_COLLECTION_REMOVE ).appendListener( listener );
        listenerRegistry.getEventListenerGroup( EventType.POST_COLLECTION_UPDATE ).appendListener( listener );
        listenerRegistry.getEventListenerGroup( EventType.FLUSH ).appendListener( listener );
        listenerRegistry.getEventListenerGroup( EventType.LOAD ).appendListener( listener );      
       
        IntegratorService integratorService = ((SessionFactoryImpl) sessionFactory).getServiceRegistry().getService(
                IntegratorService.class);

 

The listener implementation:

public class FullTextIndexEventListener implements PostDeleteEventListener,
    PostInsertEventListener, PostUpdateEventListener,
    PostCollectionRecreateEventListener, PostCollectionRemoveEventListener,
    PostCollectionUpdateEventListener, FlushEventListener,LoadEventListener{

In the implementation, index updates are submitted the same way as for updateschema:

SolrServer server = new CommonsHttpSolrServer("http://localhost:8082/solr-server");
UpdateRequest req = new UpdateRequest("/update");

Now Solr has the data.
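The listener ultimately has to turn each Hibernate event into a Solr update. As a rough sketch, assuming the unique key field is named id (an assumption; the original does not show its schema), inserts and updates can both be sent as an add message and deletes as a delete by id, using Solr's XML update format. SolrUpdateXml and its methods are hypothetical names; in the real code SolrJ's UpdateRequest takes care of building these messages.

```java
import java.util.Map;

public class SolrUpdateXml {

    /** Message for post-insert and post-update: Solr replaces any document with the same key. */
    public static String add(String id, Map<String, String> fields) {
        StringBuilder xml = new StringBuilder("<add><doc>");
        appendField(xml, "id", id); // "id" as the unique key is an assumption
        for (Map.Entry<String, String> e : fields.entrySet()) {
            appendField(xml, e.getKey(), e.getValue());
        }
        return xml.append("</doc></add>").toString();
    }

    /** Message for post-delete. */
    public static String deleteById(String id) {
        return "<delete><id>" + escape(id) + "</id></delete>";
    }

    private static void appendField(StringBuilder xml, String name, String value) {
        xml.append("<field name=\"").append(escape(name)).append("\">")
           .append(escape(value)).append("</field>");
    }

    // Minimal XML escaping for text and attribute values.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }
}
```

Treating updates as a full re-add keeps the listener simple, at the cost of resending every indexed field whenever one of them changes.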


After that, how you query is just a matter of combining Solr parameters yourself.

For example:

public <T> Page<T> query(Query q) {
        Page<T> ret = new Page<T>();

        try {
            SolrQuery query = new SolrQuery();
            query.setStart(q.getStart());
            query.setRows(q.getRows());
            query.setQuery(q.toString());
            for (Entry<String, ORDER> order : q.getOrder().entrySet()) {
                query.addSortField(order.getKey(), order.getValue());
            }
            QueryResponse rsp = solrServer.query(query);
            ret.setNumFound(rsp.getResults().getNumFound());
            ret.setqTime(Long.valueOf(rsp.getHeader().get("QTime").toString()));

            logger.info(rsp.getResults());
            @SuppressWarnings("unchecked")
            List<T> beans = getBinder().getBeans(q.getQueryClazz(), rsp.getResults());

            ret.setResult(beans);
        } catch (Exception e) {
            logger.error("Solr query failed: " + q.toString(), e);
            throw new MyException("Solr query failed: " + q.toString(), e);
        }
        return ret;
    }
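The method above relies on q.toString() producing a valid Solr query string, but the Query class itself is not shown. A minimal sketch of what such a builder might look like follows; SimpleQuery is a hypothetical stand-in, not the author's Query class. It joins field:value pairs with AND and escapes characters that have special meaning in the query syntax. SolrJ ships ClientUtils.escapeQueryChars for the same purpose, which is preferable when SolrJ is on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SimpleQuery {

    private final Map<String, String> terms = new LinkedHashMap<>();

    /** Adds a field:value condition; conditions are combined with AND. */
    public SimpleQuery where(String field, String value) {
        terms.put(field, value);
        return this;
    }

    /** Renders the query string, or the match-all query when no condition was added. */
    @Override
    public String toString() {
        if (terms.isEmpty()) {
            return "*:*";
        }
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : terms.entrySet()) {
            if (sb.length() > 0) {
                sb.append(" AND ");
            }
            sb.append(e.getKey()).append(':').append(escape(e.getValue()));
        }
        return sb.toString();
    }

    // Prefixes query syntax characters and whitespace with a backslash.
    static String escape(String value) {
        StringBuilder sb = new StringBuilder();
        for (char c : value.toCharArray()) {
            if ("\\+-!():^[]\"{}~*?|&;/".indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```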

posted on 2012-07-02 16:10  夏雨的天空