Getting Started with Elasticsearch

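Elasticsearch Overview

Elasticsearch is a full-text search engine written in Java and built on the Lucene search library. It hides Lucene's complexity behind a RESTful API (an interface design convention that makes APIs easier to understand). It supports distributed, near-real-time analysis of log data as well as features such as search completion and spelling correction, and it is the core component of the ELK stack.
Where MySQL has databases and tables, Elasticsearch calls the counterpart of a database an index.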

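Terminology

Document: a unit of stored data
Index: roughly equivalent to a database in MySQL
Type: the kind of data within an index, roughly equivalent to a table in MySQL
Field: an attribute of a document
Query DSL: the query syntax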
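As a rough illustration of how these terms fit together: in the 6.x REST API a single document is addressed by its index, type, and ID. The helper name document_path below is hypothetical and only shows how the path is assembled.

```python
def document_path(index: str, doc_type: str, doc_id) -> str:
    """Build the REST path of one document: /<index>/<type>/<id>."""
    return f"/{index}/{doc_type}/{doc_id}"


# The index "accounts" and type "person" are the ones used later in this article.
print(document_path("accounts", "person", 1))
```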

Installing Elasticsearch

Installing Java

yum install java-1.8.0-openjdk.x86_64

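Installing from the tar package

Download Elasticsearch from the official site, www.elastic.co.
Extract the tar package and go into the config directory. Besides the main configuration file elasticsearch.yml, which you need to edit, the directory contains a jvm.options file for JVM tuning.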

tar zxvf elasticsearch-6.3.tar.gz

Review and edit the configuration files

cd elasticsearch-6.3/config

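The config directory contains elasticsearch.yml, jvm.options, and log4j2.properties. Their settings are described together in a later section.

Run bin/elasticsearch

On a successful start the log shows a "started" message. If startup fails, the configuration or the operating system parameters need adjusting; the configuration notes later in this article cover the details.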

Installing with Docker

Pull the Elasticsearch image

docker pull elasticsearch:6.3

Create the directories to mount for data files, and open the communication ports.
In a CentOS terminal, run the following:

[root@localhost soft]# pwd
/home/soft
[root@localhost soft]# mkdir -p ES/config
[root@localhost soft]# cd  ES 
[root@localhost ES]# mkdir data1
[root@localhost ES]# mkdir data2
[root@localhost ES]# mkdir data3
[root@localhost ES]# firewall-cmd --add-port=9300/tcp
success
[root@localhost ES]# firewall-cmd --add-port=9301/tcp
success
[root@localhost ES]# firewall-cmd --add-port=9302/tcp
success
[root@localhost ES]# chmod 777 data1 data2 data3

Create the Elasticsearch configuration files. Use vim to create es1.yml, es2.yml, and es3.yml in the config directory:

es1.yml

cluster.name: elasticsearch-cluster
node.name: es-node1
network.bind_host: 0.0.0.0
network.publish_host: 10.100.157.208
http.port: 9200
transport.tcp.port: 9300
http.cors.enabled: true
http.cors.allow-origin: "*"
node.master: true 
node.data: true  
discovery.zen.ping.unicast.hosts: ["10.100.157.208:9300","10.100.157.208:9301","10.100.157.208:9302"]
discovery.zen.minimum_master_nodes: 2

es2.yml

cluster.name: elasticsearch-cluster
node.name: es-node2
network.bind_host: 0.0.0.0
network.publish_host: 10.100.157.208
http.port: 9201
transport.tcp.port: 9301
http.cors.enabled: true
http.cors.allow-origin: "*"
node.master: true 
node.data: true 
discovery.zen.ping.unicast.hosts: ["10.100.157.208:9300","10.100.157.208:9301","10.100.157.208:9302"]
discovery.zen.minimum_master_nodes: 2

es3.yml

cluster.name: elasticsearch-cluster
node.name: es-node3
network.bind_host: 0.0.0.0
network.publish_host: 10.100.157.208
http.port: 9202
transport.tcp.port: 9302
http.cors.enabled: true
http.cors.allow-origin: "*"
node.master: true 
node.data: true  
discovery.zen.ping.unicast.hosts: ["10.100.157.208:9300","10.100.157.208:9301","10.100.157.208:9302"]
discovery.zen.minimum_master_nodes: 2
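The three files differ only in the node name and the two port numbers, so they can also be generated instead of typed by hand. The following is a minimal sketch under that assumption; the function names render_node_config and write_configs are hypothetical, and the host address is the one used in the files above.

```python
import os

HOST = "10.100.157.208"  # publish address taken from the files above
NODE_COUNT = 3


def render_node_config(n: int) -> str:
    """Return the YAML text for node n (1-based), mirroring es1.yml to es3.yml."""
    hosts = ",".join(f'"{HOST}:{9300 + i}"' for i in range(NODE_COUNT))
    lines = [
        "cluster.name: elasticsearch-cluster",
        f"node.name: es-node{n}",
        "network.bind_host: 0.0.0.0",
        f"network.publish_host: {HOST}",
        f"http.port: {9200 + n - 1}",
        f"transport.tcp.port: {9300 + n - 1}",
        "http.cors.enabled: true",
        'http.cors.allow-origin: "*"',
        "node.master: true",
        "node.data: true",
        f"discovery.zen.ping.unicast.hosts: [{hosts}]",
        "discovery.zen.minimum_master_nodes: 2",
    ]
    return "\n".join(lines) + "\n"


def write_configs(directory: str) -> None:
    """Write es1.yml, es2.yml, es3.yml into the given directory."""
    for n in range(1, NODE_COUNT + 1):
        with open(os.path.join(directory, f"es{n}.yml"), "w") as fh:
            fh.write(render_node_config(n))
```

Calling write_configs("/home/soft/ES/config") would produce the same three files as the manual steps.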

Raise the kernel limit on virtual memory map areas (vm.max_map_count)

vim /etc/sysctl.conf

vm.max_map_count=262144 

sysctl -p

This step prevents the following error when the containers start:
bootstrap checks failed max virtual memory areas vm.max_map_count [65530] likely too low, increase to at least [262144]
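To confirm the new value before starting the containers, the current setting can be read from /proc on Linux and compared with the minimum named in the error message. A small sketch; the function names are hypothetical.

```python
REQUIRED_MAP_COUNT = 262144  # minimum quoted in the bootstrap check error


def map_count_ok(raw: str, required: int = REQUIRED_MAP_COUNT) -> bool:
    """Return True if the raw sysctl value meets the required minimum."""
    return int(raw.strip()) >= required


def read_map_count(path: str = "/proc/sys/vm/max_map_count") -> str:
    """Read the current vm.max_map_count value (Linux only)."""
    with open(path) as fh:
        return fh.read()
```

On the Docker host, map_count_ok(read_map_count()) should return True after sysctl -p has been run.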

Start the Elasticsearch cluster containers

docker run -e ES_JAVA_OPTS="-Xms256m -Xmx256m" -d -p 9200:9200 -p 9300:9300 -v /home/soft/ES/config/es1.yml:/usr/share/elasticsearch/config/elasticsearch.yml -v /home/soft/ES/data1:/usr/share/elasticsearch/data --name ES01 elasticsearch:6.3

docker run -e ES_JAVA_OPTS="-Xms256m -Xmx256m" -d -p 9201:9201 -p 9301:9301 -v /home/soft/ES/config/es2.yml:/usr/share/elasticsearch/config/elasticsearch.yml -v /home/soft/ES/data2:/usr/share/elasticsearch/data --name ES02 elasticsearch:6.3

docker run -e ES_JAVA_OPTS="-Xms256m -Xmx256m" -d -p 9202:9202 -p 9302:9302 -v /home/soft/ES/config/es3.yml:/usr/share/elasticsearch/config/elasticsearch.yml -v /home/soft/ES/data3:/usr/share/elasticsearch/data --name ES03 elasticsearch:6.3

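Note: -e ES_JAVA_OPTS="-Xms256m -Xmx256m" is set because the default jvm.options allocates a much larger minimum and maximum heap (1 GB or more, depending on the version). After starting the containers you can check memory use with the docker stats command. This option sets the heap at launch time; the same values can instead be written directly into the configuration file.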
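The three docker run commands follow one pattern: node n publishes ports 9200+n-1 and 9300+n-1 and mounts es<n>.yml and data<n>. As a sketch of that pattern, the hypothetical helper below builds the argument list for a given node, using the paths and image tag from the commands above.

```python
import shlex


def docker_run_args(n: int, base_dir: str = "/home/soft/ES",
                    heap: str = "256m", image: str = "elasticsearch:6.3") -> list:
    """Build the docker run argument list for node n (1-based)."""
    http_port = 9200 + n - 1
    transport_port = 9300 + n - 1
    return [
        "docker", "run",
        "-e", f"ES_JAVA_OPTS=-Xms{heap} -Xmx{heap}",
        "-d",
        "-p", f"{http_port}:{http_port}",
        "-p", f"{transport_port}:{transport_port}",
        "-v", f"{base_dir}/config/es{n}.yml:/usr/share/elasticsearch/config/elasticsearch.yml",
        "-v", f"{base_dir}/data{n}:/usr/share/elasticsearch/data",
        "--name", f"ES0{n}",
        image,
    ]


# Print the commands rather than running them, so they can be reviewed first.
for node in (1, 2, 3):
    print(shlex.join(docker_run_args(node)))
```

The list form can be passed to subprocess.run unchanged, which avoids shell quoting problems with the ES_JAVA_OPTS value.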

Verify that the cluster is running
Open http://10.100.157.208:9200/_cat/nodes?pretty in a browser. Output like the following means the Docker-based cluster deployment succeeded:

10.100.157.208 47 61 0 0.02 0.02 0.05 mdi - es-node3
10.100.157.208 42 61 0 0.02 0.02 0.05 mdi - es-node1
10.100.157.208 32 61 0 0.02 0.02 0.05 mdi * es-node2
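In this output the second-to-last column marks the elected master with an asterisk and the last column is the node name. The sketch below parses the sample lines on that basis; it assumes the default column layout shown above (ip, heap, ram, cpu, three load averages, role, master marker, name), and the function names are hypothetical.

```python
SAMPLE = """\
10.100.157.208 47 61 0 0.02 0.02 0.05 mdi - es-node3
10.100.157.208 42 61 0 0.02 0.02 0.05 mdi - es-node1
10.100.157.208 32 61 0 0.02 0.02 0.05 mdi * es-node2
"""


def parse_cat_nodes(text: str) -> list:
    """Parse _cat/nodes output (default columns, no header) into dicts."""
    nodes = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # At most 10 fields; the final field (the node name) keeps any spaces.
        parts = line.split(None, 9)
        nodes.append({
            "ip": parts[0],
            "role": parts[7],
            "is_master": parts[8] == "*",
            "name": parts[9],
        })
    return nodes


def elected_master(nodes: list):
    """Return the name of the node marked as master, or None."""
    for node in nodes:
        if node["is_master"]:
            return node["name"]
    return None
```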

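Configuration parameter reference

Node settings in elasticsearch.yml

cluster.name: elasticsearch-cluster # Cluster name; nodes with the same cluster name join the same cluster. Default: elasticsearch
node.name: es-node1 # Node name; if unset, Elasticsearch assigns one automatically. Node names must be unique within a cluster
path.data: /path/to/data # Directory where data is stored
path.logs: /path/to/logs # Directory where logs are stored
index.number_of_shards: 2 # Number of primary shards per index (5 by default in 6.x). Since Elasticsearch 5.0, index-level settings such as this one are set per index or in an index template, not in elasticsearch.yml
index.number_of_replicas: 1 # Number of replica copies of each shard (default 1); also an index-level setting
network.bind_host: 0.0.0.0 # Address the node binds to, IPv4 or IPv6. 0.0.0.0 accepts connections on every interface; when unset, the node binds to loopback only
network.publish_host: 192.168.9.219 # Address other nodes use to reach this node; detected automatically if unset. Must be a real IP address
http.port: 9200 # HTTP port for client requests, default 9200
transport.tcp.port: 9300 # TCP port for node-to-node communication, default 9300
http.cors.enabled: true # Whether to allow cross-origin REST requests
http.cors.allow-origin: "*" # Which origins REST requests may come from
node.master: true # Makes this node master-eligible (a candidate master). The master handles cluster-level requests and manages the cluster. A node that is not master-eligible can never become master; a master-eligible node becomes master only after the other candidates elect it.
node.data: true # Makes this node a data node, which stores data and performs data operations (CRUD, aggregations)
discovery.zen.ping.unicast.hosts: ["es-node1:9300","es-node2:9301","es-node3:9302"] # Initial list of master-eligible nodes; the node contacts these hosts to discover the rest of the cluster
discovery.zen.minimum_master_nodes: 2 # Minimum number of master-eligible nodes that must be visible before a master is elected (default 1). If fewer are reachable, the Elasticsearch log keeps reporting that there are not enough master nodes. To avoid split brain, use (total number of master-eligible nodes / 2 + 1), with integer division. Split brain: after an incomplete or failed master switchover, clients and the other nodes see two active masters at once, which leaves the whole cluster in an inconsistent state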
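The quorum formula for discovery.zen.minimum_master_nodes uses integer division. A short worked version, with a hypothetical function name:

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Quorum size: (number of master-eligible nodes // 2) + 1."""
    if master_eligible < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible // 2 + 1


# The three-node cluster in this article needs a quorum of 2.
print(minimum_master_nodes(3))
```

With three master-eligible nodes the result is 2, which matches the value in es1.yml to es3.yml: the cluster keeps a master when one node is lost, and two isolated halves can never both elect one.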

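Using the elasticsearch-head plugin

Pull the image
Start the container
Open http://10.100.157.208:9100/ in a browser

API examples

Cluster management

Cluster health check
Both the cat API and the cluster API can report cluster health. green means the cluster is fully healthy; yellow means the cluster is working but some replica shards are not healthy; red means the cluster has a fault and data may be lost.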

http://10.100.157.208:9200/_cat/health

http://10.100.157.208:9200/_cat/health?v # v adds column headers, so the output is easier to read

http://10.100.157.208:9200/_cluster/health

http://10.100.157.208:9200/_cluster/health?pretty

(Adding pretty formats the output so that it is easier to read.)
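The same health check can be scripted. The sketch below reads the status field from _cluster/health using only the standard library and maps it to the meanings given above; the function names are hypothetical, and the base URL passed in is whatever address your cluster listens on.

```python
import json
from urllib.request import urlopen

STATUS_MEANING = {
    "green": "cluster fully healthy",
    "yellow": "cluster working, some replica shards not healthy",
    "red": "cluster fault, data may be lost",
}


def describe_status(status: str) -> str:
    """Translate a health status into the explanation used in this article."""
    return STATUS_MEANING.get(status, "unknown status")


def fetch_cluster_status(base_url: str, timeout: float = 5.0) -> str:
    """Call <base_url>/_cluster/health and return its status field."""
    with urlopen(f"{base_url}/_cluster/health", timeout=timeout) as resp:
        return json.load(resp)["status"]
```

For the cluster in this article, describe_status(fetch_cluster_status("http://10.100.157.208:9200")) would print the explanation for the current state.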

List all nodes

http://10.100.157.208:9200/_cat/nodes?v

List all indices

http://10.100.157.208:9200/_cat/indices?v

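Using curl with Elasticsearch

curl sends HTTP requests from the command line, which makes it a convenient way to create and manage indices. Commonly used options:

-X: the HTTP request method, such as HEAD, POST, PUT, or DELETE

-d: the data to send

-H: an HTTP request header

1. Create an index with curl
curl -XPUT "localhost:9200/blog_test?pretty" # creates an index named blog_test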

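2. Delete an index
curl -X DELETE "localhost:9200/blog_test?pretty"

3. List the indices that exist
curl "http://localhost:9200/_cat/indices?v"

4. Change the shard and replica settings of indices

curl -XPUT 'http://localhost:9200/_all/_settings?preserve_existing=true' -H 'Content-Type: application/json' -d '{
"index.number_of_replicas" : "1",
"index.number_of_shards" : "10"
}'
Elasticsearch 6.x requires the Content-Type header on requests that carry a body. Note that index.number_of_shards is fixed when an index is created; if Elasticsearch rejects the request because of it, drop that line and set the shard count at index creation or in an index template.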
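The curl options listed above (-X, -d, -H) map directly onto an HTTP request object in code. As a sketch using only the Python standard library, the hypothetical helper build_json_request prepares such a request without sending it; the blog_test index is the one created in step 1.

```python
import json
from urllib.request import Request


def build_json_request(method: str, url: str, body=None) -> Request:
    """Prepare an HTTP request with a JSON body (the -X, -d and -H of curl)."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    return Request(
        url,
        data=data,                                       # curl -d
        method=method,                                   # curl -X
        headers={"Content-Type": "application/json"},    # curl -H
    )


# Prepare a replica-count change for the blog_test index.
req = build_json_request(
    "PUT",
    "http://localhost:9200/blog_test/_settings",
    {"index": {"number_of_replicas": 1}},
)
print(req.get_method(), req.full_url)
```

Passing req to urllib.request.urlopen would send it; building and sending are kept separate here so the request can be inspected first.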

CRUD operations in Elasticsearch

Insert a document

POST /accounts/person/1
{
    "name":"liMing",
    "last":"system"
}

Retrieve a document

GET /accounts/person/1

Update a document

POST /accounts/person/1/_update
{
    "doc": {
        "name":"liMing1",
        "last":"system1"
    }
}

Delete a document

DELETE /accounts/person/1

Using the Query DSL in Elasticsearch

Use match_all to return everyone

GET /accounts/person/_search
{
  "query": {
    "match_all": {}
  }
}

Find people whose name contains lili, sorted by age in descending order:

GET /accounts/person/_search
{
  "query": {
    "match": {
      "name": "lili"
    }
  },
  "sort": [
    {
      "age": "desc"
    }
  ]
}

Return only the fields you need

GET /accounts/person/_search
{
  "query": {
    "match_all": {}
  },
  "_source": [
    "name",
    "age"
  ]
}
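The three searches above share one shape: a query clause, an optional sort, and an optional _source filter. A minimal sketch of a builder for such request bodies follows; build_search is a hypothetical helper, not part of any Elasticsearch client.

```python
import json


def build_search(query=None, sort=None, source=None) -> dict:
    """Assemble a _search request body from its optional parts."""
    body = {"query": query if query is not None else {"match_all": {}}}
    if sort:
        # sort is a list of (field, order) pairs, e.g. [("age", "desc")]
        body["sort"] = [{field: order} for field, order in sort]
    if source:
        body["_source"] = list(source)
    return body


# The second example above: match on name, newest... rather, oldest first by age.
print(json.dumps(build_search({"match": {"name": "lili"}}, sort=[("age", "desc")]), indent=2))
```

The resulting dictionary is what would be sent as the JSON body of GET /accounts/person/_search.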

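Calling the Elasticsearch API from C#

First, add the package used to access Elasticsearch

dotnet add package ElasticsearchCRUD --version 2.4.1.1

Example: bulk-importing a batch of data into Elasticsearch

Import the package namespaces at the top of the source file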

using System;
using ElasticsearchCRUD;
using ElasticsearchCRUD.ContextAddDeleteUpdate.IndexModel.SettingsModel;
using ElasticsearchCRUD.Utils;

Write a method that inserts data in bulk

// Person (with Id, Description and Info properties) and ConnectionString
// are assumed to be defined elsewhere in the class.
public void BulkInsert()
{
    // Register a mapping for Person using a date-stamped name such as "sql20190829"
    IElasticsearchMappingResolver elasticsearchMappingResolver = new ElasticsearchMappingResolver();
    elasticsearchMappingResolver.AddElasticSearchMappingForEntityType(typeof(Person), MappingUtils.GetElasticsearchMapping("sql" + DateTime.Now.ToString("yyyyMMdd")));
    using (var context = new ElasticsearchContext(ConnectionString, elasticsearchMappingResolver))
    {
        try
        {
            // Turn off refresh and replicas while loading to speed up indexing
            context.IndexUpdateSettings(new IndexSettings { RefreshInterval = "-1", NumberOfReplicas = 0 });
            int id = 1;
            for (int i = 0; i < 100; i++)
            {
                // Queue 10,000 documents, then flush them together
                for (int t = 0; t < 10000; t++)
                {
                    var item = new Person
                    {
                        Id = id,
                        Description = "this is cool",
                        Info = "info"
                    };
                    context.AddUpdateDocument(item, item.Id);
                    id++;
                }
                context.SaveChanges();
                Console.WriteLine("Saved:" + (i + 1) * 10000 + " items");
            }
            // Restore the normal refresh interval and replica count
            context.IndexUpdateSettings(new IndexSettings { RefreshInterval = "1s", NumberOfReplicas = 1 });
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.ToString());
        }
    }
}
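The same batch-then-flush idea can be expressed directly against the REST _bulk endpoint, which takes newline-delimited JSON: one action line followed by one document line per record, ending with a newline. The sketch below only builds such bodies; the index name "people", the type "person", and the helper names are assumptions for illustration, not what the C# library necessarily sends.

```python
import json


def bulk_body(index: str, doc_type: str, docs: list) -> str:
    """Build an NDJSON body for POST /_bulk from a list of documents."""
    lines = []
    for doc in docs:
        # Action line: index this document under the given index, type and ID.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type, "_id": doc["Id"]}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the bulk API expects a trailing newline


def batches(total: int, size: int):
    """Yield ranges of IDs, size at a time, covering 1..total."""
    for start in range(1, total + 1, size):
        yield range(start, min(start + size, total + 1))


# Build one small batch to show the shape of the request body.
docs = [{"Id": i, "Description": "this is cool", "Info": "info"} for i in next(batches(25, 10))]
print(bulk_body("people", "person", docs).splitlines()[0])
```

Each body would be sent with Content-Type: application/x-ndjson; sending one request per batch keeps memory use bounded in the same way as calling SaveChanges every 10,000 documents.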

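Index templates

Variable types in a programming language do not always match the field types Elasticsearch infers. When they differ, the data may still be inserted, but Query DSL queries against it can fail with errors. An index template avoids this by defining the final field types up front. The sample template below is sent as the body of a PUT request to this endpoint:
PUT http://10.100.157.208:9200/_template/template_2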

{  
 "template" : "sqlserver_slowlog*", 
    "settings":{  
        "number_of_shards": 5,  
        "number_of_replicas": 1  
    },  
    "mappings":{  
        "slow_logs":{  
            "properties":{  
                "db_user":{  
                    "type": "keyword"  
                },  
                "app_ip":{  
                     "type": "keyword"  
                },  
                "query_time":{  
                    "type": "float"  
                },
                "timestamp":{  
                    "type": "date",  
                    "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_second"
                },
                "sql":{  
                    "type": "text"
                },
                "db_host":{  
                    "type": "keyword"
                },
                "sql_sample":{  
                    "type": "text"
                }
            }  
        }
    }
}

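With this template in place, values are mapped to the declared Elasticsearch field types regardless of the types used in the application, which also keeps queries well behaved.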
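A template is applied to any new index whose name matches its pattern, here sqlserver_slowlog*. To check in advance which index names a pattern would cover, shell-style matching from the standard library is a close approximation (it also gives ? and [ ] special meaning, which Elasticsearch patterns do not). The function name and the index names below are hypothetical.

```python
from fnmatch import fnmatchcase


def template_applies(pattern: str, index_name: str) -> bool:
    """Approximate check of whether an index name matches a template pattern."""
    return fnmatchcase(index_name, pattern)


for name in ("sqlserver_slowlog20190829", "sqlserver_errorlog20190829"):
    print(name, template_applies("sqlserver_slowlog*", name))
```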

Tutorials

https://blog.csdn.net/weixin_39800144/column/info/22641

Posted 2019-08-29 15:48 by 小泥巴2008
