[ELK] Installation and Usage

elasticsearch

Installation

elasticsearch
Version 0.9
nohup bin/elasticsearch -f &

Version 1.0
bin/elasticsearch -d

 

Status

curl -s http://127.0.0.1:9200/_status?

View historical data
curl 'http://localhost:9200/_search?pretty'

Delete Elasticsearch data
# curl -XDELETE 'http://192.168.102.74:9200/logstash-2015.11.18'
This cleared out all of the March index files; I found that deleting via curl is much faster than removing the data with rm.

curl -XDELETE 'http://192.168.102.74:9200/logstash-2015.12*'
{"acknowledged":true}

Check cluster status

 curl http://127.0.0.1:9200/_cluster/health?pretty
{
  "cluster_name" : "huored",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 3,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 26,
  "active_shards" : 26,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 26,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 50.0
}
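
Here the status is yellow because there is only one data node, so the 26 replica shards cannot be assigned. The same check can run from a script, e.g. as a cron alert; a minimal sketch assuming the requests package and the endpoint used above:

import requests

health = requests.get("http://127.0.0.1:9200/_cluster/health").json()
if health["status"] != "green":
    # yellow usually means replica shards could not be assigned
    print("cluster %s is %s, unassigned shards: %s" % (
        health["cluster_name"], health["status"], health["unassigned_shards"]))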

 

Get the cluster node list

curl http://127.0.0.1:9200/_cat/nodes?pretty    
10.4.235.3  9 99 2 1.00 0.89 0.71 mdi - es-1-3
10.4.235.5  7 99 1 0.68 0.52 0.45 di  - es-1-5
10.4.235.4  9 99 2 0.76 0.67 0.56 di  - es-1-4
10.4.235.2  8 48 0 0.22 0.25 0.24 mi  * es-1-2
10.4.235.1 10 48 0 0.27 0.21 0.16 mi  - es-1-1

 

List all indices

curl http://127.0.0.1:9200/_cat/indices?pretty
health status index               uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green open live-2017.07.17     mocPDLuWSJWmiMIn4tAvMw 5 1       1 0  45.2kb  22.6kb
green open live-2017.07.19     xnJLD1xUSdWlWrpn7dST4Q 5 1      13 0   637kb 318.5kb
green open logstash-2015.05.18 PBvyKpt9RASIS82-qiBKpw 5 1       0 0   1.2kb    650b
green open live-2017.07.26     wKPRMU68Rf-NY22Y_xnXFQ 5 1 6374636 0   4.9gb   2.4gb
green open live-2017.07.25     _y1GQ36bQEu3HNI_rqhA6g 5 1      12 0   311kb 155.5kb
green open .kibana             sJezfYMWS2WmI-mf2IE8nA 1 1       1 0   6.3kb   3.1kb
green open live-2017.07.27     PjrsyqsKSKOXdFrTvSfWZg 5 1  371734 0 288.1mb 142.9mb

  

API

pyes: calling the ES API from Python

import pyes
conn = pyes.es.ES("http://10.xx.xx.xx:8305/")
search = pyes.query.MatchAllQuery().search(bulk_read=1000)
hits = conn.search(search, 'store_v1', 'client', scan=True, scroll="30m", model=lambda _, hit: hit)
for hit in hits:
    # print hit
    conn.index(hit['_source'], 'store_v2', 'client', hit['_id'], bulk=True)
conn.flush()

 Using the elasticsearch and elasticsearch_dsl modules

pip install elasticsearch 
pip install elasticsearch_dsl
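
A minimal query sketch with these two packages; the host, index pattern, and the message field are placeholders, not taken from this setup:

from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search

es = Elasticsearch(["http://127.0.0.1:9200"])

s = (Search(using=es, index="logstash-*")
     .query("match", message="error")   # full-text match on the message field
     .extra(size=10))

for hit in s.execute():
    # hit.meta carries _id, _index, _score; to_dict() returns the _source
    print(hit.meta.id, hit.to_dict())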

  

Test

curl -XPOST http://127.0.0.1:9200/index/weibo/1 -d'
{
  "weibo": {
    "pubtime": "2011-09-27",
    "id": "001",
    "content": "ElasticSearch是一个很不错的搜索引擎框架",
    "author": {
      "reg_time": "2011-09-27",
      "name": "medcl"
    }
  }
}
'

  

curl -XGET http://192.168.102.74:9200/index/weibo/_search?q=pubtime:2011-09-27
{"took":2,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"index","_type":"weibo","_id":"1","_score":1.0, "_source" : 
{
"weibo": {
"pubtime":"2011-09-27",
"id": "001",
"content": "ElasticSearch是一个很不错的搜索引擎框架",
"author": {
"reg_time": "2011-09-27",
"name": "medcl"
}
}
}
}]}}

 

Modify aliases

curl -XPOST 'http://192.168.102.74:9200/_aliases' -d '
{
"actions": [
{ "remove": {
"alias": "store",
"index": "store_v1"
}},
{ "add": {
"alias": "store",
"index": "store_v2"
}}
]
}
'
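
The same swap can be done through the Python client; a sketch assuming the elasticsearch package installed above (update_aliases wraps the _aliases endpoint):

from elasticsearch import Elasticsearch

es = Elasticsearch(["http://192.168.102.74:9200"])
# remove and add in one call so the alias never points at zero indices
es.indices.update_aliases(body={
    "actions": [
        {"remove": {"alias": "store", "index": "store_v1"}},
        {"add":    {"alias": "store", "index": "store_v2"}},
    ]
})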

  

Get index mapping info
curl -XGET 'http://192.168.102.74:9200/logstash-2016.05.10/_mapping/icg-log'

 

kibana

When rendering, Kibana automatically converts timestamps into the time zone your browser is in;
the times shown in the UI are based on the local system time.

Note: when nginx load-balances Kibana the pages may fail to open; use ip_hash (source-address) balancing.

 

Adding login/account control


1. Configure the nginx password: nginx can require password authentication for a site, a directory, or even specific files. The password must be crypt-encrypted; you can create it with Apache's htpasswd tool.
Format: htpasswd -b -c site_pass username password

upstream etl_server {
server 127.0.0.1:9200;
}

server { 
listen 80; 
server_name 192.168.102.74; 
location / { 
# Your Kibana path. 
root /var/www/html/; 
index index.html index.htm; 
auth_basic "Restricted"; 
# Path to your nginx password file 
auth_basic_user_file /usr/local/nginx/conf/site_pass; 
}

location ~ ^/_aliases$ { 
proxy_pass http://etl_server; 
proxy_read_timeout 90; 
} 
location ~ ^/.*/_aliases$ { 
proxy_pass http://etl_server; 
proxy_read_timeout 90; 
} 
location ~ ^/_nodes$ { 
proxy_pass http://etl_server; 
proxy_read_timeout 90; 
} 
location ~ ^/.*/_search$ { 
proxy_pass http://etl_server; 
proxy_read_timeout 90; 
} 
location ~ ^/.*/_mapping$ { 
proxy_pass http://etl_server; 
proxy_read_timeout 90; 
}

# Password protected end points 
location ~ ^/kibana-int/dashboard/.*$ { 
proxy_pass http://etl_server; 
proxy_read_timeout 90; 
limit_except GET { 
proxy_pass http://etl_server; 
auth_basic "Restricted"; 
# Path to your nginx password file; if you want separate authentication when saving panels, just generate an additional password file. 
auth_basic_user_file /usr/local/nginx/conf/site_pass_nopass; 
} 
} 

location ~ ^/kibana-int/temp.*$ { 
proxy_pass http://etl_server; 
proxy_read_timeout 90; 
limit_except GET { 
proxy_pass http://etl_server; 
auth_basic "Restricted"; 
# Path to your nginx password file; if you want separate authentication when saving panels, just generate an additional password file. 
auth_basic_user_file /usr/local/nginx/conf/site_pass_nopass; 
} 
}
}

  

2. Edit config.js under the kibana directory: change elasticsearch: "http://"+window.location.hostname+":9200", (note the trailing comma)
to elasticsearch: "http://"+window.location.hostname+":34707", since that is the port exposed to the public network.

3. gzip is not enabled for HTTP by default; enable it in config.js:
http.compression: true

Kibana Lucene query syntax

Wrap a term in double quotes to search it as a phrase
"like Gecko"

You can also search on the fields listed on the left side of the page
Full-text search restricted to a field: field:value
Exact search: put the keyword in double quotes field:"value"
http.code:404 searches for documents with HTTP status code 404

Whether a field exists at all
_exists_:http : results must contain an http field
_missing_:http : results must not contain an http field

Wildcards
? matches a single character
* matches zero or more characters
kiba?a , el*search
? and * cannot be used as the first character, e.g.: ?text *text

Regex
ES supports a subset of regex features
mesg:/mes{2}ages?/

Fuzzy search
~ : append ~ to a word to enable fuzzy matching
first~ will also match frist
You can also specify how much similarity is required
cromm~0.3 will match from and chrome
The value ranges from 0.0 to 1.0, default 0.5; the higher it is, the closer to the original term


Range search
Numeric and date fields can be queried over a range
length:[100 TO 200]
date:{"now-6h" TO "now"}
[ ] means the endpoint values are included in the range, { } means they are excluded


Escaping special characters
+ - && || ! () {} [] ^"" ~ * ? : \
These characters must be escaped with \ when they are searched as literal values

 

Lucene query syntax

request:qrCode.htm   AND remote_addr:115.192.38.124

Use + for a clause that must match and - for one that must not match:
+type:"nginx_waf" -attack_method:"White_IP"

// query this domain where the request latency exceeds 10s
server_name: worker.caihongshop.com AND upstream_response_time: [10 TO *]
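
The same query strings can be sent straight to the _search endpoint's q= parameter, which is what Kibana does under the hood. A small sketch assuming the requests package; the host and index pattern are placeholders:

import requests

q = '+type:"nginx_waf" -attack_method:"White_IP"'
resp = requests.get("http://127.0.0.1:9200/logstash-*/_search",
                    params={"q": q, "size": 5})
for hit in resp.json()["hits"]["hits"]:
    print(hit["_index"], hit["_id"], hit["_score"])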

  

 

logstash

Startup

Run in the background
#bin/logstash -f nginx_access.conf -t # check the config; start once it passes
#bin/logstash -f nginx_access.conf --verbose # to inspect errors; use --debug for more detail
#bin/logstash -f nginx_access.conf &

Input plugin (input)

input {
 file {
 type => "syslog"
 path => ["/usr/local/logstash/my.log"]
 }
 }

 input {
 file {
 type => "syslog"
 path => ["/var/log/secure","/var/log/messages"]
 exclude => ["*.gz","shipper.log"]
 sincedb_path => "/dev/null" # required when starting from a script
 }
 }
 file {
 type => "icg-log"
 path => ["/usr/local/logstash/icg.log"]
 start_position => "beginning"
 sincedb_path => "/dev/null" # required when start_position => "beginning" is used
 }
 
 
 input {
 file {
 type=>"xx_server_log"
 path=>"/opt/software/apache-tomcat-7.0.59/logs/catalina.out"
 codec=> multiline {
 pattern => "(^.+Exception:.+)|(^\s+at .+)|(^\s+... \d+ more)|(^\s*Caused by:.+)"
 what=> "previous"
 }
}
 }

  

 log4j output under Tomcat

log4j.rootLogger=DEBUG, logstash 
 ###SocketAppender###
 log4j.appender.logstash=org.apache.log4j.net.SocketAppender
 log4j.appender.logstash.Port=4560
 log4j.appender.logstash.RemoteHost=127.0.0.1
 log4j.appender.logstash.ReconnectionDelay=60000
 log4j.appender.logstash.LocationInfo=true

  
Regex matching

2017-08-28 16:40:56,739 - {"type":"WEB","url":"http://api.jczj123.com:80/client/service.json","accessStart":1503909656677,"accessEnd":1503909656739,"accessCost":62,"ip":"112.5.237.76","method":"POST","operator":"-2","headers":{"appName":"jczj-iphone","appUserAgent":"IOS_VERSION9.300000SCREEN_HEIGHT0","appVersion":"1.1.8","user-agent":"%E7%AB%9E%E5%BD%A9%E4%B9%8B%E5%AE%B6/1.1.8.0 CFNetwork/758.5.3 Darwin/15.6.0"},"parameters":{"sign":["56d771b05eebf2e11b30a984e3223228"],"loginKey":["JCZJ_LOGIN20170210222417"],"orderBy":["SET_TOP_AND_GMT_DESC"],"objectId":["98012017082038715931"],"service":["COMMENT_QUERY"],"currentPage":["1"],"objectType":["LOTTERY_PROJECT"],"authedUserId":["8201610261214815"]},"logStart":1503909656739,"logEnd":1503909656739,"logCost":0}
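
Before writing the grok pattern it can help to prototype the match in Python: the line is a timestamp, a literal " - ", then a JSON payload, so a named capture plus json.loads recovers the fields. A sketch only; the group name wdata mirrors the grok example further below, and the sample line is shortened:

import json
import re

PATTERN = re.compile(r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - (?P<wdata>.+)$")

# shortened version of the log line above, for illustration only
line = '2017-08-28 16:40:56,739 - {"type":"WEB","accessCost":62,"ip":"112.5.237.76"}'

m = PATTERN.match(line)
if m:
    doc = json.loads(m.group("wdata"))
    print(m.group("ts"), doc["type"], doc["accessCost"])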

  

Filter plugin (filter)

Data format:
{
"message" => "dfg",
"@version" => "1",
"@timestamp" => "2015-11-19T02:02:11.602Z",
"host" => "nlkf1"
}

 

filter 
{ 
if [type] == "TmpLog" { # if [foo] in ["hello", "world", "foo"] 
mutate { 
replace => { "type" => "apache_access" } # mutate: modify fields (rename, update, replace, split ...) 
split => ["message",":"] 
} 
grok { # the main parser 
patterns_dir => ["/home/logtools/logstash-1.4.2/mypatterns"] # directory of custom patterns (mostly regular expressions) 
match => { "message" => "%{UserOnOffLog}" } 
} 
alter { # modify fields (officially this may be merged into mutate later) 
condrewrite => [ "host", "%{host}", "10.0.0.139" ] # rewrite the field if its content equals the expected value ["field_name", "expected_value", "new_value"] 
} 
date { # parse dates 
match => [ "create_time" , "yyyy/MM/dd HH:mm:ss" ] 
} 
multiline { # merge multiple lines into one event, e.g. java stack traces 
type => "somefiletype" 
pattern => "^\s" # lines starting with whitespace 
what => "previous" # merge with the previous line 
} 
} 
}

 

Grok matching

Online docs 

https://doc.yonyoucloud.com/doc/logstash-best-practice-cn/filter/grok.html

 

%{PATTERN_NAME:capture_name:data_type}
Tip: data_type currently supports only two values: int and float
filter {
grok {
match => {
"message" => "%{WORD} %{NUMBER:request_time:float} %{WORD}"
}
}
}

"message" => "%{WORD}<result>%{NUMBER:result:int}<%{WORD}" 
"<result>%{NUMBER:result:int}</result>"

-- catalina log parsing
https://segmentfault.com/q/1010000003801260

SERVER_LOG %{DATA:year}-%{DATA:month}-%{DATA:day}\ %{DATA:hour}\:%{DATA:min}\:%{DATA:sec}\ %{DATA:level}\ %{DATA:class} -{ip:%{DATA:ip},url:%{DATA:url},param:%{DATA:param},return:%{DATA:return},cost:%{BASE10NUM:cost}

 

  

Used to filter on certain conditions

filter {
        if [type] =~ "warehouse" {
             grok {
             match => {
             "message" => "\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3}\s-\s(?<wdata>.+)"
             }
             remove_field => ['message']
            }
         }
        json {
                source => "nginx_access,nginx_waf"
                target => "jsoncontent"
        }
}

  

Online grok debugger
http://grokdebug.herokuapp.com/
Regex matching notes
http://xiaorui.cc/2015/01/27/logstash%E4%BD%BF%E7%94%A8grok%E6%AD%A3%E5%88%99%E8%A7%A3%E6%9E%90%E6%97%A5%E5%BF%97%E9%81%87%E5%88%B0%E7%9A%84%E9%97%AE%E9%A2%98/


<function>sendCTD</function>

grok {
patterns_dir => ["/usr/local/logstash/patterns"]
match => { "message" => "%{NGINX_ACCESS_LOG}" }
}

You can define your own patterns based on the built-in ones
# NGINX_ACCESS_LOG
NGINX_ACCESS_LOG \[%{IP:real_ip}\] \[(?<remote_user>.+?)\] \[%{HTTPDATE:httpdate}\] \[(?<request>.+?)\] \[%{INT:status}\] \[%{INT:body_bytes_sent}\] \[(?<http_referer>.+?)\] \[(?<http_user_agent>.+?)\] \[%{IP:x_forward}\]

\.+\s-\s\[?<message>\.+?\]

 

filter {
        if [type] =~ "warehouse" {
        grok {
             match => {
             "message" => "(?<dttime>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},)"
             match => { "path" => "%{GREEDYDATA}/%{GREEDYDATA:app}.access.log" }
             # grok breaks on match by default. So the first match being good, it skips the second one.
             match => { "path" => "%{GREEDYDATA}/%{GREEDYDATA:app}.access.log" }
             }
        }

        mutate {
             gsub => [ "dttime"," ","T" ]
             gsub => [ "dttime",",","+08:00" ]
        }

        }
}

=============================================

grok {
	match => {
	"path" => "/opt/docker_logs/(?<hostname>.+-\d{1,2}-\d{1,2})/.+.log"
	}
}

  

Replace with gsub

# replace all commas with dots
"logTimestampString", ",", "."
mutate {
gsub => [ "server_name","\.","_" ]
}
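
gsub is just a regex substitution applied to a field, so the expression can be sanity-checked with the Python equivalent (illustrative value only):

import re

server_name = "worker.caihongshop.com"      # illustrative value
print(re.sub(r"\.", "_", server_name))      # -> worker_caihongshop_com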

  

Geo lookup with geoip

geoip {
source => "client_ip"
database => "/usr/local/logstash/vendor/geoip/GeoLiteCity.dat"
fields => [ "city_name","country_name","region_name","real_region_name","country_code2"]
remove_field => [ "[geoip][latitude]", "[geoip][longitude]"]
}

The Kibana world map requires country_code2

  

Output (output)

output {
 stdout{
 codec=>rubydebug
 }
 redis {
 host => "192.168.102.74:6380"
 data_type => "list"
 key => "logstash:zgt"
 }
 }

  

input {
        file {
                path => "/usr/local/tengine/logs/access.log"
                type => "nginx_access"
                codec => json
                sincedb_path => "/dev/null"
        }
        file {
                path => "/usr/local/tengine/logs/error.log"
                type => "nginx_error"
                sincedb_path => "/dev/null"
        }

        file {
                path => "/usr/local/tengine/logs/waf.log"
                type => "nginx_waf"
                codec => json
                sincedb_path => "/dev/null"
        }
        file {
                path => "/var/log/messages"
                type => "syslog"
                sincedb_path => "/dev/null"
        }

}


filter {
        json {
                source => "nginx_access,nginx_waf"
                target => "jsoncontent"
        }

}


output {
        redis {
                host => "10.4.230.2"
                port => "6379"
                data_type => "list"
                key => "logstash-redis"
        }



        if [type] == 'nginx_access' and [host] =~ 'webpd-1-*' {
                kafka {
                bootstrap_servers => "DT-WH-1-3:9092,DT-WH-1-4:9092,DT-WH-1-5:9092"
                topic_id => "jczj_log"
                codec => "json"
                }
        }

#       stdout {
#               codec =>rubydebug
#       }
#       file {
#               path =>"/tmp/logsatsh.log"
#               codec => plain {
#                       charset =>GBK
#               }
#       }


}

  

Test

bin/logstash -e 'input{stdin{}}output{stdout{codec=>rubydebug}}'

bin/logstash -e 'input{stdin{}}output{stdout{codec=>rubydebug} redis{host=>"192.168.102.74:6380" data_type=>"list" key=>"zgt"}}'

bin/logstash -e 'input{file{type=>"system" path=>["/usr/local/logstash/my.log"]}}output{stdout{codec=>rubydebug} redis{host=>"192.168.102.74:6380" data_type=>"list" key=>"zgt"}}' 

bin/logstash -e 'input{stdin{}}filter{grok{match=>{"message"=>"%{WORD}<result>%{NUMBER:result:int}<%{WORD}"}}}output{stdout{codec=>rubydebug}}'


# bin/logstash -e 'input{stdin{}}filter{grok{match=>{"message"=>"%{WORD} %{NUMBER:request_time:float} %{WORD}"}}}output{stdout{codec=>rubydebug}}'
begin 20120909 end
{
"message" => "begin 20120909 end",
"@version" => "1",
"@timestamp" => "2015-11-19T02:09:32.109Z",
"host" => "nlkf1",
"request_time" => 20120909.0
}

 

filebeat

Installation

https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-5.2.2-linux-x86_64.tar.gz

  

Filebeat can reach 2k+ events per second, while logstash 6+ only manages about 300; the performance difference is large

filebeat.prospectors:
- input_type: log
  tags: ["nginx_log"]
  paths:
    - /usr/local/nginx/logs/*.log
  json.keys_under_root: true  # read the JSON keys into the event root; otherwise the content stays inside the message field
  json.overwrite_keys: true
  tail_files: true

- input_type: log
  tags: ["app_log"]
  paths:
    - /home/admin/*-output/logs/*-*-*/*.log
  encoding: gbk  # character set
  multiline:  # multiline matching
   pattern: '^\d{4}\-\d{2}\-\d{2}\s\d+\:\d+\:\d+\,\d+'
   negate: true
   match: after

Output to redis
output.redis:
  enabled: true
  hosts: ["10.4.234.137:6379"]
  key: log-filebeat
  db: 0
  datatype: list
  worker: 1
  timeout: 5s

Output to Elasticsearch
output.elasticsearch:
  hosts: ["xxx.xxx.xxx.xxx:9200"]
  index: "rainbow-%{+YYYY.MM.dd}"

  

Configuration provided by Jingjing 

filebeat.inputs:

- input_type: log
  paths:
    - /home/recognizebasic/logs/root.log
  fields:
    log_source: prod_tomcat
    app: basic

  multiline:
    pattern: ^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]{3}
    negate: true
    match: after

- input_type: log
  paths:
    - /home/recognizecore/logs/root.log
  fields:
    log_source: prod_tomcat
    app: core

  multiline:
    pattern: ^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]{3}
    negate: true
    match: after

  

posted @ 2019-04-28 16:28 richardzgt