Solr study notes

 

Official resources
https://archive.apache.org/dist/lucene/solr/
IK Analyzer

Theory
https://www.cnblogs.com/javawxid/p/12812016.html
Hands-on demos
https://blog.csdn.net/luo609630199/article/details/82494708
https://blog.csdn.net/Cs_hnu_scw/article/details/79388080
Solr 7 & Solr 8
https://blog.csdn.net/loumoxiaozi/article/details/81186916
https://blog.csdn.net/u010510107/article/details/81051795
https://blog.csdn.net/bskfnvjtlyzmv867/article/details/80940089
https://www.cnblogs.com/yangk1996/p/12657671.html
https://www.cnblogs.com/zhi-leaf/p/11601253.html
https://www.cnblogs.com/zhi-leaf/p/11602253.html
https://www.cnblogs.com/zhi-leaf/p/11604289.html
https://www.cnblogs.com/zhi-leaf/p/11605092.html
https://www.cnblogs.com/zhi-leaf/p/11605289.html
https://blog.csdn.net/lhc0512/article/details/82315117
https://www.cnblogs.com/ITDreamer/p/10661873.html
https://www.cnblogs.com/ITDreamer/p/10661949.html
https://www.cnblogs.com/leslia/p/9544202.html
https://www.cnblogs.com/flyingaway/p/8058914.html
Series
https://www.cnblogs.com/gaogaoyanjiu/p/7798187.html
https://www.cnblogs.com/gaogaoyanjiu/p/7837389.html


Highlighting
https://www.cnblogs.com/LUA123/p/8283131.html
https://www.cnblogs.com/itdragon/p/8007144.html
Indexing Word and PDF files
https://www.cnblogs.com/woshuaile/p/12176462.html

SolrJ
Version 8 and above
http://zpycloud.com/archives/1110
https://blog.csdn.net/Hello_World_QWP/article/details/98331238
https://www.itsource.cn/web/news/2118.html
https://segmentfault.com/a/1190000012349643
Queries
http://zpycloud.com/archives/1146
https://www.cnblogs.com/jepson6669/p/9142676.html
====
Latest version (8.6)
https://xuexiyuan.cn/article/detail/109.html
https://blog.csdn.net/Hello_World_QWP/article/details/98331238
https://www.cnblogs.com/carlosouyang/p/11352779.html
https://www.cnblogs.com/jepson6669/p/9142676.html
https://www.cnblogs.com/frankdeng/p/9615856.html
org.apache.commons.e
https://bbs.huaweicloud.com/blogs/102897
http://www.voidcn.com/article/p-oyziqtgk-bkw.html

Scheduled incremental updates
https://blog.csdn.net/h0713/article/details/82503848
https://www.cnblogs.com/milude0161/p/9228547.html
https://www.cnblogs.com/Pcaiwen/p/9588075.html
https://blog.csdn.net/qq_21046965/article/details/86771013
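Scheduled tasks:
Use the operating system's task scheduler to run curl against these URLs:
http://your_ip/dataimport?command=full-import&clean=true&commit=true (full import)
http://your_ip/dataimport?command=delta-import&clean=false&commit=true (delta import)

If clean is true, the existing index is cleared before the import runs. Watch this parameter on delta imports: with clean=true Solr wipes its data and then imports only the delta (computed before the wipe), which leaves the index incomplete.
Rebuild the full index when traffic is low, for example a full import every day at 4 a.m. and a delta import every 10 minutes.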
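
The schedule described above can also be driven from a small Java program instead of the system scheduler. The following is only a sketch under assumptions: the class name DihScheduler, its helper methods, and the core URL passed on the command line are made up for illustration, and it needs Java 11 or later for java.net.http. It sets clean=true only for full imports, so a delta import never wipes the index.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class DihScheduler {
    // Builds the DIH URL for one core. clean is true only for a full import,
    // so a delta import never clears the existing index.
    static String importUrl(String coreUrl, boolean full) {
        String command = full ? "full-import" : "delta-import";
        return coreUrl + "/dataimport?command=" + command + "&clean=" + full + "&commit=true";
    }

    // Seconds from "now" until the next occurrence of the given time of day.
    static long secondsUntil(LocalDateTime now, LocalTime timeOfDay) {
        LocalDateTime next = now.toLocalDate().atTime(timeOfDay);
        if (!next.isAfter(now)) {
            next = next.plusDays(1);
        }
        return Duration.between(now, next).getSeconds();
    }

    // Sends the GET request. Failures are logged rather than thrown, because an
    // exception escaping a scheduled task suppresses all of its later runs.
    static void trigger(HttpClient client, String url) {
        try {
            HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
            int status = client.send(request, HttpResponse.BodyHandlers.discarding()).statusCode();
            System.out.println(url + " -> HTTP " + status);
        } catch (Exception e) {
            System.err.println(url + " failed: " + e);
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: java DihScheduler <core url, e.g. http://localhost:8983/solr/news>");
            return;
        }
        String coreUrl = args[0];
        HttpClient client = HttpClient.newHttpClient();
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // Full import once a day, first run at the next 04:00.
        scheduler.scheduleAtFixedRate(() -> trigger(client, importUrl(coreUrl, true)),
                secondsUntil(LocalDateTime.now(), LocalTime.of(4, 0)),
                TimeUnit.DAYS.toSeconds(1), TimeUnit.SECONDS);
        // Delta import every 10 minutes.
        scheduler.scheduleAtFixedRate(() -> trigger(client, importUrl(coreUrl, false)),
                TimeUnit.MINUTES.toSeconds(10), TimeUnit.MINUTES.toSeconds(10), TimeUnit.SECONDS);
    }
}
```

Run it with the core URL as the only argument. The URLs it builds include the /solr/<core> path, which the your_ip placeholders above leave out.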

demo
https://gitee.com/Getawy/solr-tomcat
https://gitee.com/liudaac/IKAnalyzer2017_6_6_0

Common commands
solr start -p <port>    start Solr in standalone mode
solr restart -p <port>  restart Solr
solr stop -p <port>     stop Solr
solr create -c <name>   create a core
solr create -c coretest
solr create -c news
solr create -c sempbasecasus
solr create -c sempbasedanger
solr create -c sempbaseplanbase
solr create -c sempbaseknoexperi
solr create -c sempbaseknostand
solr create -c sempbaseknolaw
solr create -c sempqueryall
http://localhost:8983/solr
http://192.168.8.201:8983/solr
http://192.168.8.201:9999/solr

Deleting the index
<delete><query>*:*</query></delete>
<commit/>
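
The two XML messages above are normally sent to the core's /update handler. As a rough sketch, and not necessarily how the author ran them, the same delete can be posted from Java 11+ with the JDK's HTTP client. The class name SolrDeleteByQuery and the core URL argument are assumptions for illustration. The commit=true request parameter stands in for the separate commit message.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class SolrDeleteByQuery {
    // Escapes the characters that are special in XML text content.
    static String escapeXml(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Builds the XML delete message; the query "*:*" matches every document.
    static String deleteXml(String query) {
        return "<delete><query>" + escapeXml(query) + "</query></delete>";
    }

    // Posts the message to the core's /update handler and returns the HTTP status.
    // commit=true makes the deletion visible right away.
    static int delete(String coreUrl, String query) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(coreUrl + "/update?commit=true"))
                .header("Content-Type", "text/xml; charset=UTF-8")
                .POST(HttpRequest.BodyPublishers.ofString(deleteXml(query)))
                .build();
        return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.discarding())
                .statusCode();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            // Without a core URL, only show what would be sent.
            System.out.println("would send: " + deleteXml("*:*"));
            return;
        }
        System.out.println("HTTP " + delete(args[0], "*:*"));
    }
}
```

Passing a narrower query than *:* deletes only the matching documents, which is safer to try first on a test core.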

1. Step one
Chinese word segmentation
IK Analyzer configuration
Latest version
https://www.cnblogs.com/bxcsx/p/11599650.html
https://github.com/magese/ik-analyzer-solr/tree/v8.3.0
lucene-analyzers-smartcn
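ansj Chinese tokenizer
(1) Put the downloaded jar into solr-7.3.0/server/solr-webapp/webapp/WEB-INF/lib, that is, the webapp/WEB-INF/lib/ directory of the Jetty or Tomcat instance that runs Solr.
(2) Put the 5 configuration files from the resources directory into the webapp/WEB-INF/classes/ directory of the same Jetty or Tomcat instance: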

① IKAnalyzer.cfg.xml
② ext.dic
③ stopword.dic
④ ik.conf
⑤ dynamicdic.txt
(3) Add the IK tokenizer to Solr's managed-schema: open the managed-schema file in solr/server/solr/mycore/conf and add a field type such as:
<!-- IK tokenizer -->
<fieldType name="text_ik" class="solr.TextField">
<analyzer type="index">
<tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="false" conf="ik.conf"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="true" conf="ik.conf"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>



2. Step two: configure DIH (Data Import Handler)
(1) Edit $SOLR_HOME/server/solr/<core name>/conf/solrconfig.xml and add the following:
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
<str name="config">data-config.xml</str>
</lst>
</requestHandler>

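(2) Create data-config.xml in $SOLR_HOME/server/solr/<core name>/conf/.
It can also be copied from E:\搜素引擎\solr-8.6.0\example\example-DIH\solr\db\conf. Its content looks like this:
column must match the column name in the database table; name must match the field name in the schema.
Oracle configuration
<dataConfig>
<dataSource driver="oracle.jdbc.driver.OracleDriver" url="jdbc:oracle:thin:@192.168.2.218:1521:product" user="database username" password="database password" />
<document name="product">
<!-- query: the full-import query -->
<!-- deltaQuery: finds the keys of rows changed since the last import; ${dataimporter.last_index_time} is the time of the last import -->
<!-- deltaImportQuery: loads each changed row by the key that deltaQuery returned -->
<entity name="bless" pk="bless_id" query="select * from bless"
deltaImportQuery="select * from bless where bless_id='${dih.delta.bless_id}'"
deltaQuery="select bless_id from bless where bless_time > '${dataimporter.last_index_time}'">
</entity>
</document>
</dataConfig>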
MySQL configuration
<?xml version="1.0" encoding="UTF-8" ?>
<dataConfig>
<dataSource type="JdbcDataSource"
driver="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost:3306/hobart-solr"
user="root"
password="root"/>
<document>
<entity name="user" query="SELECT * FROM user ">
<field column="uid" name="uid"/>
<field column="usercode" name="user_code"/>
<field column="account" name="account"/>
<field column="password" name="password"/>
<field column="username" name="user_name"/>
<field column="birthday" name="birthday"/>
<field column="gender" name="gender"/>
<field column="lastLoginTime" name="last_login_time"/>
<field column="updateTime" name="update_time"/>
</entity>
</document>
</dataConfig>

(3) Edit the managed-schema file in $SOLR_HOME/server/solr/<core name>/conf/ and define the fields there. Each field's name attribute must match the name used in data-config.xml:

<field name="uid" type="string" indexed="true" stored="true" />
<field name="user_code" type="string" indexed="true" stored="true" />
<field name="user_account" type="string" indexed="true" stored="true" />
<field name="user_username" type="string" indexed="true" stored="true" />
<field name="user_birthday" type="string" indexed="true" stored="true" />
<field name="user_address" type="string" indexed="true" stored="true" />
<field name="user_gender" type="string" indexed="true" stored="true" />
<field name="user_last_login_time" type="string" indexed="true" stored="true" />
<field name="user_update_time" type="string" indexed="true" stored="true" />

(4) Copy the following jars into $SOLR_HOME/server/solr-webapp/webapp/WEB-INF/lib/:
They can be copied from E:\搜素引擎\solr-8.6.0\dist
If you do not copy them, point Solr at the jars with lib tags in solrconfig.xml:

<!-- dataimport and MySQL jars -->
<lib dir="${solr.install.dir:..}/contrib/db/lib" regex=".*\.jar" />
<lib dir="${solr.install.dir:..}/dist/" regex="solr-dataimporthandler-7.4.0.jar" />

solr-dataimporthandler.jar
solr-dataimporthandler-extras.jar
mysql-connector-java.jar (I use MySQL, so I copy the MySQL driver; for another database, copy its driver instead)
For Oracle, download ojdbc6.jar
(5) Run the import
http://localhost:8983/solr/<core name>/dataimport?command=full-import

 

3. Step three

String baseUrl = ResourceUtil.getConfigByName("sempbasecasus");
String baseUrlall = ResourceUtil.getConfigByName("sempbasequeryall");

Delete
deleteDocumentById(sempBaseCasus.getId(),baseUrl);
deleteDocumentById(sempBaseCasus.getId(),baseUrlall);

Batch delete
deleteDocumentById(id,baseUrl);
deleteDocumentById(id,baseUrlall);

Add
addDocument(sempBaseCasus,baseUrl);
addDocumentAll(sempBaseCasus,baseUrlall);

Update
deleteDocumentById(sempBaseCasus.getId(),baseUrl);
deleteDocumentById(sempBaseCasus.getId(),baseUrlall);
addDocument(t,baseUrl);
addDocumentAll(t,baseUrlall);
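
The helpers called above (deleteDocumentById, addDocument, addDocumentAll) are not shown in these notes, so the following is only a guess at what an add or update could look like underneath. It uses the JDK's HTTP client (Java 11+) and Solr's JSON update format instead of the author's code. The class name SolrJsonUpdate, the field names, and the core URL are hypothetical. Solr by default replaces an existing document when a new one with the same unique key is added, so when the id is unchanged the delete-then-add sequence can usually be reduced to a single add.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class SolrJsonUpdate {
    // Escapes a value for use inside a JSON string literal.
    static String escapeJson(String value) {
        StringBuilder out = new StringBuilder();
        for (char c : value.toCharArray()) {
            if (c == '"' || c == '\\') {
                out.append('\\').append(c);
            } else if (c < 0x20) {
                // Control characters are written as four-digit hex escapes.
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Serializes one document as a JSON array holding a single object,
    // a shape the /update handler accepts for adding documents.
    static String toJson(Map<String, String> doc) {
        StringJoiner fields = new StringJoiner(",", "[{", "}]");
        doc.forEach((name, value) ->
                fields.add("\"" + escapeJson(name) + "\":\"" + escapeJson(value) + "\""));
        return fields.toString();
    }

    // Adds the document, replacing any existing one with the same unique key.
    static int addOrReplace(String coreUrl, Map<String, String> doc) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(coreUrl + "/update?commit=true"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(toJson(doc)))
                .build();
        return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.discarding())
                .statusCode();
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "1");                 // hypothetical unique key value
        doc.put("title", "example title");  // hypothetical field
        if (args.length == 0) {
            System.out.println("would send: " + toJson(doc));
            return;
        }
        System.out.println("HTTP " + addOrReplace(args[0], doc));
    }
}
```

In a real project SolrJ would normally do this work. The plain HTTP version is used here only so the sketch has no dependencies.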

http://localhost:8080/solrj/news
http://localhost:8080/solrj/news/search
http://localhost:8080/solrj/css/pager.css
pagenumber:${pageNumber},
pagecount:${totalPages},


Date issues
TrieDateField is the older date field type (deprecated since Solr 7 in favor of DatePointField)
<fieldType name="date" class="solr.TrieDateField"
sortMissingLast="true" omitNorms="true"/>
<fieldType name="pdate" class="solr.DatePointField" docValues="true"/>
<fieldType name="pdates" class="solr.DatePointField" docValues="true" multiValued="true"/>
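
These notes do not say which date problem came up, so the following rests on an assumption: a frequent one is the time-zone offset. Solr date fields store and return UTC timestamps in the form YYYY-MM-DDThh:mm:ssZ, so a local time has to be converted before indexing and converted back after querying. The sketch below shows that conversion with java.time. The class name SolrDates and the Asia/Shanghai zone are chosen only for illustration.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.temporal.ChronoUnit;

class SolrDates {
    // Converts a local wall-clock time into the UTC string that Solr date fields accept.
    static String toSolrDate(LocalDateTime local, ZoneId zone) {
        Instant instant = local.atZone(zone).toInstant().truncatedTo(ChronoUnit.SECONDS);
        // Instant.toString() produces ISO-8601 in UTC with a trailing Z.
        return instant.toString();
    }

    // Converts a date string returned by Solr back into local time for the given zone.
    static LocalDateTime fromSolrDate(String solrDate, ZoneId zone) {
        return Instant.parse(solrDate).atZone(zone).toLocalDateTime();
    }

    public static void main(String[] args) {
        ZoneId zone = ZoneId.of("Asia/Shanghai");
        String indexed = toSolrDate(LocalDateTime.of(2020, 8, 1, 11, 10), zone);
        System.out.println(indexed);                     // 2020-08-01T03:10:00Z
        System.out.println(fromSolrDate(indexed, zone)); // 2020-08-01T11:10
    }
}
```

The eight-hour difference between the two printed values is the offset that tends to show up as "wrong" dates when local times are indexed or displayed without conversion.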


https://www.jianshu.com/p/8f65ffbd5c74


import javax.persistence.Transient;
import org.apache.solr.client.solrj.beans.Field;

// Fragment of an entity class; the rest of the class is not shown.
// @Field("id") goes on the entity's id field.

// String property indexed into the Solr field CREATE_DATE.
@Field("CREATE_DATE")
private String createdatesolr;

// @Transient keeps the property out of the JPA mapping.
@Transient
public String getCreatedatesolr() {
    return createdatesolr;
}

public void setCreatedatesolr(String createdatesolr) {
    this.createdatesolr = createdatesolr;
}

// String property indexed into the Solr field UPDATE_DATE.
@Field("UPDATE_DATE")
private String UPDATE_DATE;

@Transient
public String getUPDATE_DATE() {
    return UPDATE_DATE;
}

public void setUPDATE_DATE(String UPDATE_DATE) {
    this.UPDATE_DATE = UPDATE_DATE;
}


posted @ 2020-08-01 11:10  DarJeely