Elasticsearch Installation and Basic Usage
Installing Elasticsearch
Prerequisites
Elasticsearch depends on the JDK, and it refuses to start as the root user. Starting it as root produces the following error:
Exception in thread "main" java.lang.RuntimeException: don't run elasticsearch as root.
at org.elasticsearch.bootstrap.Bootstrap.initializeNatives(Bootstrap.java:93)
at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:144)
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:270)
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:35)
Refer to the log for complete error details.
So make sure the JDK is installed on every node of the cluster and is accessible to a non-root user.
Check the JDK path and its permissions.
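A quick sanity check, run as the non-root user that will later start Elasticsearch (assuming the JDK is already on its PATH):
java -version
readlink -f $(which java)   #shows where the JDK actually lives, so its permissions can be checked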
Upload the Elasticsearch package and extract it:
tar -xzvf elasticsearch-6.3.1.tar.gz
Edit the configuration files
Two files need changes: the JVM options and the main Elasticsearch configuration file.
vim elasticsearch-6.3.1/config/elasticsearch.yml
#Find the network.host line, uncomment it, and set it to this host's address
network.host: hadoop01
vim elasticsearch-6.3.1/config/jvm.options
#JVM heap settings; size them to the machine. Keep -Xms and -Xmx equal (Elastic recommends at most half of physical RAM)
-Xms1g
-Xmx1g
Adjust system limits
Elasticsearch also places requirements on OS limits: the maximum number of open files must be at least 65536, otherwise startup fails with an error like:
max file descriptors [...] for elasticsearch process is too low, increase to at least [65536]
Edit two system configuration files.
vim /etc/security/limits.conf
#Add these lines before the "End of file" marker
* hard nofile 655360
* soft nofile 131072
* hard nproc 4096
* soft nproc 2048
(There is no need to source /etc/security/limits.conf; it is not a shell script, which is why trying to source it prints a handful of "command not found" style errors. The new limits take effect automatically the next time the user logs in.)
vim /etc/sysctl.conf
#Append at the end of the file
vm.max_map_count=655360
fs.file-max=655360
sysctl -p
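After logging in again as the user that will run Elasticsearch, the new limits can be verified:
ulimit -Hn    #hard open-file limit
ulimit -Sn    #soft open-file limit
sysctl vm.max_map_count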
Start Elasticsearch
Switch to a non-root user before starting; creating a dedicated user works well. Then change into the Elasticsearch installation directory.
adduser elk
passwd elk
su elk
cd bin/
./elasticsearch
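If startup fails with permission errors on the logs/ or data/ directories, the new user most likely does not own the extracted installation; as root, something like the following fixes it (path taken from the extraction step above):
chown -R elk:elk elasticsearch-6.3.1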
Once the startup log settles, open port 9200 on the host; if a JSON document comes back, Elasticsearch is up.
http://hadoop01:9200
#The JSON response
{
"name" : "bU3-0bt",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "kw0nLVTDS9ivVvQTL_Q-lQ",
"version" : {
"number" : "6.3.1",
"build_flavor" : "default",
"build_type" : "tar",
"build_hash" : "eb782d0",
"build_date" : "2018-06-29T21:59:26.107521Z",
"build_snapshot" : false,
"lucene_version" : "7.3.1",
"minimum_wire_compatibility_version" : "5.6.0",
"minimum_index_compatibility_version" : "5.0.0"
},
"tagline" : "You Know, for Search"
}
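The same check can also be done from the shell:
curl http://hadoop01:9200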
http://hadoop01:9200/_cat/indices?v
#Returns a table of indices; on a fresh install only the header row appears.
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
Installing Kibana
Extract
For convenience, Kibana can be extracted into the same parent directory as Elasticsearch.
tar -xzvf kibana-6.3.1-linux-x86_64.tar.gz
Kibana itself is a pure front-end tool.
Configuration
There is only one configuration file, and it is simple:
vim kibana-6.3.1-linux-x86_64/config/kibana.yml
Uncomment these three lines and replace the addresses with your own:
server.host: "hadoop01"
elasticsearch.url: "http://hadoop01:9200"
kibana.index: ".kibana"
Start
Change into Kibana's bin directory and start it. Kibana is fairly slow to come up, usually one to three minutes.
cd ../bin
nohup ./kibana &
Open http://hadoop01:5601/; if the Kibana UI loads, the startup succeeded.
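Because Kibana was started with nohup, its startup output goes to nohup.out in the directory it was launched from; tailing it shows when the server is ready:
tail -f nohup.out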
Using Elasticsearch
The requests below are written in the Kibana Dev Tools console format.
#Create the movie index
PUT /movie_index
#Inspect the movie index
GET /movie_index
#Insert a document
PUT movie_index/movie/1
{
"movie_name":"red sea action",
"price":1000.00,
"time":"2018-10-10",
"actors":[
{"name":"zhang han yu","sex":1},{"name":"peng yu yan","sex":0},{"name":"zhang yi","sex":0}]
}
#Insert another document
PUT movie_index/movie/2
{
"movie_name":"red sea event",
"price":200.00,
"time":"2018-01-05",
"actors":[
{"name":"zhang yu","sex":1},{"name":"zhang san","sex":0},{"name":"zhang yi","sex":0}]
}
#Full-text (keyword) query
GET /movie_index/movie/_search
{
"query": {
"match": {
"movie_name": "red event"
}
}
}
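The match query above analyzes the search text, so "red event" matches both documents. To combine full-text matching with a structured condition, a bool query can be used; a sketch against the same index (field names taken from the documents inserted above, so here only the cheaper movie matches):
#Match on movie_name, but only keep documents with price <= 500
GET /movie_index/movie/_search
{
  "query": {
    "bool": {
      "must": { "match": { "movie_name": "red" } },
      "filter": { "range": { "price": { "lte": 500 } } }
    }
  }
}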
#Delete a document
DELETE movie_index/movie/1
#Create a second movie index
PUT /movie_index1
#Insert a document
PUT movie_index1/movie/1
{
"movie_name":"红海行动 ",
"price":100.00,
"time":"2018-01-05",
"actors":[
{"name":"张涵予","sex":1},{"name":"张三","sex":0},{"name":"张毅","sex":0}],
"movie_business":["1","2","842"]
}
#Insert another document
PUT movie_index1/movie/2
{
"movie_name":"红海事件",
"price":200.00,
"time":"2018-01-05",
"actors":[
{"name":"张涵予","sex":1},{"name":"李四","sex":0}],
"movie_business":["2","3","843"]
}
#Keyword query against the Chinese-text index
GET /movie_index1/movie/_search
{
"query": {
"match": {
"movie_name": "红"
}
}
}
#Analyzer test (ik_max_word is provided by the elasticsearch-analysis-ik plugin, which must be installed separately)
GET _analyze
{
"text":"我是中国人",
"analyzer":"ik_max_word"
}
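For comparison, the built-in standard analyzer splits CJK text into single characters, which is why a dedicated Chinese analyzer such as IK is normally used:
#Same text through the standard analyzer
GET _analyze
{
"text":"我是中国人",
"analyzer":"standard"
}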
Importing data from MySQL
Write a small Spring Boot program to do the batch import.
pom dependencies
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-jdbc</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <dependency>
        <groupId>org.mybatis.spring.boot</groupId>
        <artifactId>mybatis-spring-boot-starter</artifactId>
        <version>1.3.4</version>
    </dependency>
    <dependency>
        <groupId>mysql</groupId>
        <artifactId>mysql-connector-java</artifactId>
        <scope>runtime</scope>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-test</artifactId>
        <scope>test</scope>
    </dependency>
    <!-- tk.mybatis general mapper -->
    <dependency>
        <groupId>tk.mybatis</groupId>
        <artifactId>mapper-spring-boot-starter</artifactId>
        <version>1.2.3</version>
        <exclusions>
            <exclusion>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-starter-jdbc</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
    </dependency>
    <!-- https://mvnrepository.com/artifact/io.searchbox/jest -->
    <dependency>
        <groupId>io.searchbox</groupId>
        <artifactId>jest</artifactId>
        <version>5.3.3</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/net.java.dev.jna/jna -->
    <dependency>
        <groupId>net.java.dev.jna</groupId>
        <artifactId>jna</artifactId>
        <version>4.5.1</version>
    </dependency>
</dependencies>
application.properties
# Tomcat port
server.port=8080
# Data source (MySQL)
spring.datasource.url=jdbc:mysql://hadoop01:3306/elasticsearch
spring.datasource.username=root
spring.datasource.password=root
# mybatis
mybatis.mapper-locations=classpath:mapper/*Mapper.xml
mybatis.configuration.map-underscore-to-camel-case=true
# Elasticsearch (Jest) configuration
spring.elasticsearch.jest.uris=http://hadoop01:9200
Batch insert method
@RunWith(SpringRunner.class)
@SpringBootTest
public class DemoApplicationTests {

    @Autowired
    SkuInfoMapper skuInfoMapper;
    @Autowired
    JestClient jestClient;

    @Test
    public void inputTest() throws IOException {
        // Query the product rows from MySQL; MyBatis maps them into Java objects
        List<SkuInfo> skuInfos = skuInfoMapper.selectAll();
        System.out.print(skuInfos.size());
        // The autowired JestClient holds the ES connection; build one Index action per document
        for (SkuInfo skuInfo : skuInfos) {
            Index index = new Index.Builder(skuInfo).index("gmall").type("SkuInfo").id(skuInfo.getId()).build();
            // Send the document to ES (one request per document)
            jestClient.execute(index);
        }
    }
}
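The SkuInfo entity and SkuInfoMapper autowired above are not shown; a minimal sketch, assuming a hypothetical sku_info table with id, sku_name and price columns (names are illustrative only):
// SkuInfo.java -- hypothetical entity; column names are assumptions
import javax.persistence.Id;
import javax.persistence.Table;

@Table(name = "sku_info")
public class SkuInfo {
    @Id
    private String id;
    private String skuName;
    private Double price;

    public String getId() { return id; }
    public void setId(String id) { this.id = id; }
    public String getSkuName() { return skuName; }
    public void setSkuName(String skuName) { this.skuName = skuName; }
    public Double getPrice() { return price; }
    public void setPrice(Double price) { this.price = price; }
}

// SkuInfoMapper.java -- selectAll() is inherited from tk.mybatis' generic Mapper
import tk.mybatis.mapper.common.Mapper;

public interface SkuInfoMapper extends Mapper<SkuInfo> {
}
The application class also needs a @MapperScan annotation (from tk.mybatis.spring.annotation) pointing at the mapper package so the interface gets picked up.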
The loop reads the MySQL rows through MyBatis, wraps each row in a JavaBean, and indexes the documents one at a time through the ES API. (Elasticsearch does in fact have a bulk API, and Jest exposes it as a Bulk action, which is much faster for large tables; the per-document loop is simply the easiest thing that works.) The code here is a straight copy of a simple Spring test.
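A minimal sketch of the bulk variant that could replace the loop above (requires import io.searchbox.core.Bulk; index and type names reuse the ones from the test):
// Collect all Index actions into a single Bulk request instead of executing them one by one
Bulk.Builder bulkBuilder = new Bulk.Builder()
        .defaultIndex("gmall")
        .defaultType("SkuInfo");
for (SkuInfo skuInfo : skuInfos) {
    bulkBuilder.addAction(new Index.Builder(skuInfo).id(skuInfo.getId()).build());
}
// One HTTP round-trip for the whole batch (split into chunks for very large tables)
jestClient.execute(bulkBuilder.build());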
Installing Tomcat and Nginx
Install and configure Tomcat
Download and extract a Tomcat package; any working version will do.
tar -xzvf apache-tomcat-7.0.47.tar.gz
Edit the configuration file.
vim apache-tomcat-7.0.47/conf/server.xml
Add a <Context> element as the last entry inside the <Host> element; it simply maps a URL path to a directory of web resources.
<Context path = "/manager" docBase = "/usr/elk/project/manager-test" debug = "0" privileged = "true"/>
Start Tomcat; if hadoop01:8080 shows the Tomcat home page and hadoop01:8080/manager shows the configured content, it worked.
./apache-tomcat-7.0.47/bin/startup.sh
Install and configure Nginx
Nginx can be installed straight from yum:
yum install nginx
systemctl start nginx
At this point hadoop01 serves the default Nginx welcome page.
Edit the Nginx configuration to set up a reverse proxy
vim /etc/nginx/conf.d/default.conf
Modify the location / {} block and add an upstream block outside the server block:
upstream manager {
    #Upstream pointing at Tomcat on port 8080
    server localhost:8080 weight=10;
}
server {
    #...
    location / {
        #Requests to hadoop01:80/ are forwarded to the upstream defined above
        proxy_pass http://manager;
        root /usr/share/nginx/html;
        index index.html index.htm;
    }
    #...
}
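Before restarting, the configuration can be syntax-checked:
nginx -t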
Restart Nginx; hadoop01 should now show the Tomcat home page.
systemctl restart nginx
Nginx error
In practice, after the reverse proxy was configured, Nginx did not display the proxied page and instead returned an error page:
An error occurred.
Sorry, the page you are looking for is currently unavailable.
Please try again later.
If you are the system administrator of this resource then you should check the error log for details.
Faithfully yours, nginx.
Check the Nginx error log:
cat /var/log/nginx/error.log
[crit] 10478#10478: *1 connect() to 192.168.15.101:8080 failed (13: Permission denied) while connecting to upstream, client: 192.168.15.1, server: localhost, request: "GET / HTTP/1.1", upstream: "http://192.168.15.101:8080/", host: "hadoop01"
The log shows a permission problem; after some searching, the cause turned out to be SELinux being enabled, and turning SELinux off resolves it.
vim /etc/selinux/config
Change the setting to:
#SELINUX=enforcing
SELINUX=disabled
Reboot the machine (or run setenforce 0 to switch SELinux to permissive mode immediately, without a reboot); after that the proxy works.
Installing Logstash
Upload and extract the package
tar -xzvf logstash-6.3.1.tar.gz
Create a new configuration file, test.conf:
vim logstash-6.3.1/config/test.conf
with the following content:
input {
  stdin { }
}
output {
  stdout { codec => "rubydebug" }
}
Start Logstash (it takes a while to come up). If it errors out, append -t to the command to test the configuration file. Once running, anything typed on the console is echoed back as a rubydebug-formatted event.
./logstash-6.3.1/bin/logstash -f logstash-6.3.1/config/test.conf
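Typing something like hello world should produce an event roughly like this (timestamp and host are illustrative and will differ):
{
    "@timestamp" => 2020-10-13T08:00:00.000Z,
      "@version" => "1",
          "host" => "hadoop01",
       "message" => "hello world"
}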
Configure Logstash
A new configuration file that reads the Nginx access log and ships it to Elasticsearch:
input {
  file {
    path => ["/var/log/nginx/access.log"]
    type => "nginx_access"
    #start_position => "beginning"
  }
}
filter {
  if [type] == "nginx_access" {
    grok {
      patterns_dir => "/usr/elk/project/patterns/"
      match => {
        "message" => "%{NGINXACCESS}"
      }
    }
    date {
      match => ["timestamp","dd/MMM/yyyy:HH:mm:ss Z"]
    }
    if [param] {
      ruby {
        init => "@kname = ['quote','url_args']"
        code => "
          new_event = LogStash::Event.new(Hash[@kname.zip(event.get('param').split('?'))])
          new_event.remove('@timestamp')
          event.append(new_event)
        "
      }
    }
    if [url_args] {
      ruby {
        init => "@kname = ['key','value']"
        code => "event.set('nested_args',event.get('url_args').split('&').collect{|i| Hash[@kname.zip(i.split('='))]})"
        remove_field => ["url_args","param","quote"]
      }
    }
    mutate {
      convert => ["response","integer"]
      remove_field => "timestamp"
    }
  }
}
output {
  stdout {
    codec => rubydebug
  }
  elasticsearch {
    hosts => ["http://hadoop01:9200"]
    index => "logstash-%{type}-%{+YYYY.MM.dd}"
  }
}
The patterns_dir => "/usr/elk/project/patterns/" setting in the filter points at custom grok patterns (regular expressions) describing the Nginx log format. Create a file named nginx in that directory with the following content:
NGINXACCESS %{IPORHOST:clientip} %{HTTPDUSER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-)
NGINXACCESSLOG %{COMMONAPACHELOG} %{QS:referrer} %{QS:agent}
Start Logstash:
./logstash-6.3.1/bin/logstash -f logstash-6.3.1/config/elasticDemo.conf
Then click around on the pages served through Nginx; parsed events will start printing to the Logstash console.
Kibana visualization
First list the indices to confirm the new index was created:
GET _cat/indices
An index named logstash-nginx_access-[date] should appear; query its contents:
GET logstash-nginx_access-2020.10.13/_search
Kibana -> Management -> Index Patterns -> Create
Create a new index pattern backed by the Elasticsearch index. The example here uses simulated phone-sales data.
Once created, go to Kibana -> Visualize -> Create and pick a suitable chart type.
Taking a Pie chart as an example: click Pie, then select the index pattern added earlier.
Under Metrics -> Slice Size, choose an aggregation; the usual ones such as Count, Max, and Sum are all available.
Under Buckets -> Split Slices -> Filters, enter the field queries you want, and Add a few of them. Click the triangle (apply) button above Metrics to run; a simple chart appears, and Save in the top-right corner stores it.