Elasticsearch v6.5.3 / v7.17.4: hot-loading extension words, stop words, and synonyms from MySQL

Installation

Official site

https://www.elastic.co/cn/downloads/past-releases/elasticsearch-6-5-3

Kibana

https://www.elastic.co/cn/downloads/past-releases/kibana-6-5-3

Kibana Chinese localization

https://github.com/JGMa-java/kibana_zh-cn

Elasticsearch concepts

How ES relates to a database

Tokenization examples

Built-in

PUT http://localhost:9200/test
{
  "settings": {
    "analysis": {
      "analyzer": {
        "ik_max_word": {
          "type": "custom",
          "tokenizer": "ik_max_word"
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "content": {
          "type": "text",
          "analyzer": "ik_max_word"
        }
      }
    }
  }
}
POST http://localhost:9200/test/doc
{
  "content": "北京我来了!"
}
POST http://localhost:9200/test/_search
{
  "query": {
    "match": {
      "content": "我"
    }
  }
}
POST http://localhost:9200/test/_analyze
{
  "text":"我是中国人"
}

Blog posts about IK

https://zhuanlan.zhihu.com/p/381913918
https://www.cnblogs.com/bubu99/p/13599687.html

Installing the analysis plugin

https://github.com/medcl/elasticsearch-analysis-ik/releases/tag/v6.5.3
You can download the zip to a local directory first
and install it from the file:

elasticsearch-plugin install file:///D:/opt/es/elasticsearch-6.5.3/elasticsearch-analysis-ik-6.5.3.zip

The command below downloads straight from GitHub, so it tends to fail when the connection to GitHub is unstable.

./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.5.3/elasticsearch-analysis-ik-6.5.3.zip

Then restart ES.

POST http://localhost:9200/test/_analyze
{
  "analyzer": "ik_smart", 
  "text": "我是中国人"
}

POST http://localhost:9200/test/_analyze
{
  "analyzer": "ik_max_word", 
  "text": "我是中国人"
}
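
The same two requests can also be sent from Java. The following is a minimal sketch, assuming Java 11+ (for java.net.http) and an ES node with the IK plugin listening on http://localhost:9200. The class name AnalyzeRequestSketch and its helper methods are made up for illustration, and the hand-rolled JSON escaping only covers quotes and backslashes.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class AnalyzeRequestSketch {

    // Minimal escaping for a JSON string value (quotes and backslashes only).
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds the same JSON body as the requests above.
    static String body(String analyzer, String text) {
        return "{\"analyzer\":\"" + escape(analyzer) + "\",\"text\":\"" + escape(text) + "\"}";
    }

    // ES 6+ rejects bodies without a Content-Type header, so it is set explicitly.
    static HttpRequest build(String baseUrl, String index, String analyzer, String text) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_analyze"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body(analyzer, text)))
                .build();
    }

    public static void main(String[] args) throws InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        for (String analyzer : new String[] {"ik_smart", "ik_max_word"}) {
            HttpRequest request = build("http://localhost:9200", "test", analyzer, "我是中国人");
            try {
                HttpResponse<String> response =
                        client.send(request, HttpResponse.BodyHandlers.ofString());
                System.out.println(analyzer + " -> " + response.body());
            } catch (IOException e) {
                System.err.println("ES not reachable: " + e.getMessage());
            }
        }
    }
}
```

Comparing the two responses side by side makes the difference visible: ik_smart returns the coarsest split, while ik_max_word returns every word it can find in the text.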

Custom dictionary

D:\opt\es\elasticsearch-6.5.3\config\analysis-ik



Then restart ES.

POST http://localhost:9200/test/_analyze
{
  "analyzer": "ik_max_word", 
  "text": "我是中国人"
}

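Remote dictionary

Install nginx
and place the dictionary file in its html directory.

Edit the file content.
The remote file must be UTF-8 encoded.

ES reloads the dictionary after the file changes.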
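
As far as I understand the IK README, the plugin polls the remote URL and reloads when either the Last-Modified or the ETag response header changes, which nginx updates automatically for static files. The check can be sketched roughly as below; this is an illustration of the idea, not IK's actual code, and the class name RemoteDictChangeSketch is hypothetical.

```java
import java.util.Objects;

public class RemoteDictChangeSketch {
    // Header values seen on the previous poll (null before the first poll).
    private String lastModified;
    private String eTag;

    /**
     * Returns true when either header differs from the previous poll,
     * then remembers the new values for the next comparison.
     */
    public boolean shouldReload(String newLastModified, String newETag) {
        boolean changed = !Objects.equals(newLastModified, lastModified)
                || !Objects.equals(newETag, eTag);
        lastModified = newLastModified;
        eTag = newETag;
        return changed;
    }

    public static void main(String[] args) {
        RemoteDictChangeSketch monitor = new RemoteDictChangeSketch();
        // First poll: nothing remembered yet, so the dictionary is loaded.
        System.out.println(monitor.shouldReload("Sat, 16 Sep 2023 10:00:00 GMT", "\"a1\""));
        // Same headers again: the file has not changed.
        System.out.println(monitor.shouldReload("Sat, 16 Sep 2023 10:00:00 GMT", "\"a1\""));
        // The file was edited, so nginx reports new headers.
        System.out.println(monitor.shouldReload("Sat, 16 Sep 2023 10:05:00 GMT", "\"b2\""));
    }
}
```

Run as is, this prints true, false, true.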

MySQL dictionary

Download the source code for the matching version

https://github.com/medcl/elasticsearch-analysis-ik/tree/v6.5.3

Open it in IntelliJ IDEA

Maven

        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>8.0.26</version>
        </dependency>

jdbc.yml

jdbc:
  url: jdbc:mysql://localhost:3306/esciku?useUnicode=true&autoReconnect=true&failOverReadOnly=false&characterEncoding=utf8&useSSL=false&serverTimezone=UTC
  user: root
  password: root
  sql: SELECT  keyword FROM hot_words WHERE flag=0
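
The loader code later in this post reads jdbc.yml with java.util.Properties rather than a YAML parser, which is why the values are looked up under the plain keys url, user, password and sql. Here is a small sketch of how Properties treats such a file; the class name JdbcConfigSketch and the shortened URL are illustrative only.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class JdbcConfigSketch {

    // Parses the text the way Properties.load does: leading whitespace is skipped
    // and the first ':' or '=' after the key separates key and value.
    static Properties parse(String text) {
        Properties prop = new Properties();
        try {
            prop.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return prop;
    }

    public static void main(String[] args) {
        String text = String.join("\n",
                "jdbc:",
                "  url: jdbc:mysql://localhost:3306/esciku?useSSL=false",
                "  user: root",
                "  password: root",
                "  sql: SELECT  keyword FROM hot_words WHERE flag=0");
        Properties prop = parse(text);
        System.out.println("url      = " + prop.getProperty("url"));
        System.out.println("user     = " + prop.getProperty("user"));
        System.out.println("password = " + prop.getProperty("password"));
        System.out.println("sql      = " + prop.getProperty("sql"));
    }
}
```

The jdbc: header line simply becomes a key with an empty value, and the nesting is ignored. If the file were ever parsed as real YAML instead, the lookups would have to change to match the nested structure.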

plugin.xml

        <dependencySet>
            <outputDirectory/>
            <useProjectArtifact>true</useProjectArtifact>
            <useTransitiveFiltering>true</useTransitiveFiltering>
            <excludes>
                <exclude>mysql:mysql-connector-java</exclude>
            </excludes>
        </dependencySet>

Code

    static {
        try {
            // Load the MySQL JDBC driver class so that it registers itself with DriverManager
            Class.forName("com.mysql.cj.jdbc.Driver");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }


    /**
     * Load the hot-update dictionary from MySQL.
     * Requires java.io.InputStream, java.io.IOException, java.nio.file.Files,
     * java.util.Properties and java.sql.{Connection, DriverManager, Statement, ResultSet}.
     */
    private void loadExtDictByMysql() {
        Properties prop = new Properties();
        // jdbc.yml sits in the IK config directory and is read as key/value pairs
        try (InputStream inputStream = Files.newInputStream(PathUtils.get(getDictRoot(), "jdbc.yml"))) {
            prop.load(inputStream);
        } catch (IOException e) {
            logger.error("failed to read jdbc.yml", e);
            return;
        }
        // try-with-resources closes the result set, statement and connection in reverse order
        try (Connection conn = DriverManager.getConnection(
                     prop.getProperty("url"),
                     prop.getProperty("user"),
                     prop.getProperty("password"));
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(prop.getProperty("sql"))) {

            while (rs.next()) {
                String theWord = rs.getString("keyword");
                // skip NULL or blank rows instead of failing the whole reload
                if (theWord != null && !theWord.trim().isEmpty()) {
                    _MainDict.fillSegment(theWord.trim().toCharArray());
                }
            }
            logger.info("Loaded the hot-update dictionary from MySQL.");
        } catch (Exception e) {
            logger.error("error", e);
        }
    }

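The parameters of scheduleAtFixedRate are:
- The first is the task to run, a Runnable (written here as a lambda).
- The second is the initial delay, i.e. how long after scheduling the first run starts.
- The third is the period, i.e. the interval between the starts of successive runs.
- The fourth is the time unit that applies to the second and third parameters.
                        // hot-reload the dictionary from MySQL: first run after 10 s, then every 120 s
                        pool.scheduleAtFixedRate(() -> Dictionary.getSingleton().loadExtDictByMysql(), 10, 120, TimeUnit.SECONDS);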
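
To see the scheduling behaviour in isolation, here is a small standalone sketch that uses a counter as a stand-in for the MySQL reload. The class name ReloadSchedulerSketch, the helper method and the millisecond timings are assumptions for the demo; only scheduleAtFixedRate itself comes from the plugin code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ReloadSchedulerSketch {

    /** Schedules a counting task and waits until it has run at least `runs` times. */
    static int runAtFixedRate(int runs, long initialDelayMs, long periodMs) {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger count = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(runs);
        try {
            pool.scheduleAtFixedRate(() -> {
                count.incrementAndGet(); // stand-in for loadExtDictByMysql()
                done.countDown();
            }, initialDelayMs, periodMs, TimeUnit.MILLISECONDS);
            // Give up after 10 seconds so the demo can never hang.
            done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow(); // stop the periodic task
        }
        return count.get();
    }

    public static void main(String[] args) {
        // First run after 100 ms, then one run every 200 ms.
        System.out.println("runs: " + runAtFixedRate(3, 100, 200));
    }
}
```

One detail worth knowing: if the task throws an exception, scheduleAtFixedRate suppresses all later runs. That is why it matters that loadExtDictByMysql catches and logs its own exceptions instead of letting them escape.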

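Package the project with mvn package. This produces a new elasticsearch-analysis-ik-6.5.3.zip under elasticsearch-analysis-ik\target\releases. Unzip it, then copy elasticsearch-analysis-ik-6.5.3.jar and the MySQL connector jar (mysql-connector-java-8.0.26.jar, matching the Maven dependency above) into the plugins\ik folder of the ES installation.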

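Problem 1

The JRE security policy does not grant the permissions needed to create class loaders and access class members.
In the jre/lib/security folder there is a java.policy file; add the following grants inside its grant {} block.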

permission java.lang.RuntimePermission "createClassLoader"; 
permission java.lang.RuntimePermission "getClassLoader"; 
permission java.lang.RuntimePermission "accessDeclaredMembers";
permission java.lang.RuntimePermission "setContextClassLoader";

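Problem 2

The JRE security policy does not grant the socket permissions needed to connect to MySQL.
As before, add the following grants inside the grant {} block of the java.policy file in jre/lib/security.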

permission java.net.SocketPermission "127.0.0.1:3306","accept";
permission java.net.SocketPermission "127.0.0.1:3306","listen";
permission java.net.SocketPermission "127.0.0.1:3306","resolve";
permission java.net.SocketPermission "127.0.0.1:3306","connect";

Finally

I added the word 我是中国 to the dictionary.

Test

{
  "analyzer": "ik_max_word", 
  "text": "我是中国人"
}


Plugin source code
https://gitee.com/linzm1007/elasticsearch-analysis-ik-6.5.4-mysql8
https://gitee.com/linzm1007/elasticsearch-analysis-ik-7.x-mysql8

Synonyms


Create the index (typeless mapping, ES 7.x)

PUT http://localhost:9200/synonyms_index
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "index": {
      "analysis": {
        "filter": {
          "doc_synonym": {
            "type": "synonym",
            "synonyms": ["苹果,iphone,ipad","理想,理想汽车"]
          }
        },
        "analyzer": {
          "my_doc_syno": {
            "type": "custom",
            "tokenizer": "ik_smart",
            "filter": [
              "doc_synonym"
            ]
          }
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "my_doc_syno"
      }
    }
  }
}
POST http://localhost:9200/synonyms_index/_analyze
{
  "field": "name",
  "text": "苹果,理想"
}
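
Each comma-separated rule in the synonyms array defines a group of equivalent terms: a token that matches any member of the group is expanded to all members. The toy sketch below illustrates that reading of the rule format in plain Java. It is not how Elasticsearch implements the synonym filter, and the class name SynonymRuleSketch is made up.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class SynonymRuleSketch {
    // Maps every term to the whole group it belongs to.
    private final Map<String, Set<String>> groups = new HashMap<>();

    /** Registers one rule such as "a,b,c": all listed terms become equivalent. */
    void addRule(String rule) {
        Set<String> group = new LinkedHashSet<>();
        for (String term : rule.split(",")) {
            group.add(term.trim());
        }
        for (String term : group) {
            groups.put(term, group);
        }
    }

    /** Returns the token's synonym group, or just the token if no rule mentions it. */
    Set<String> expand(String token) {
        return groups.getOrDefault(token, Collections.singleton(token));
    }

    public static void main(String[] args) {
        SynonymRuleSketch synonyms = new SynonymRuleSketch();
        synonyms.addRule("苹果,iphone,ipad");
        synonyms.addRule("理想,理想汽车");
        System.out.println(synonyms.expand("苹果"));
        System.out.println(synonyms.expand("理想"));
        System.out.println(synonyms.expand("北京")); // no rule, stays as is
    }
}
```

In the _analyze response above, the same effect shows up as extra tokens of type SYNONYM at the same position as the original token.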

https://zhuanlan.zhihu.com/p/381936025
Code
https://gitee.com/linzm1007/elasticsearch-analysis-dynamic-synonym-7.x-mysql8

posted @ 2023-09-16 23:40  linzm14