Logstash: Getting Started and Architecture Overview
Pipeline
Three stages: input / filter / output
Input Plugins
- Stdin / File
- Log4j / JDBC / Kafka
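As a rough illustration of how two of the Input plugins listed above might be declared, here is a minimal sketch. The log path, broker address, topic name and consumer group id are placeholders chosen for this example, not values from the original notes.

```
input {
  # Read log files. The path is a placeholder; the file input needs an absolute path.
  file {
    path => "/var/log/app/*.log"
    start_position => "beginning"
  }
  # Consume from Kafka. Broker address, topic and group id are assumed values.
  kafka {
    bootstrap_servers => "localhost:9092"
    topics => ["app-logs"]
    group_id => "logstash-demo"
  }
}
output {
  # Print every Event so the result of each input is visible on the console.
  stdout { codec => rubydebug }
}
```

Events from both inputs flow into the same Pipeline, so the filter and output stages see them as one stream.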
Output Plugins
Sends Events to a specific destination; this is the last stage of the Pipeline.
Common Output Plugins
- Elasticsearch
- Kafka
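The two outputs above can be combined in one Pipeline. The sketch below is only an illustration: the host, index name, broker address and topic are assumed values.

```
input {
  stdin {}
}
output {
  # Index each Event into Elasticsearch. Host and index name are assumed values.
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "demo-logs"
  }
  # Publish the same Event to Kafka as JSON. Broker and topic are assumed values.
  kafka {
    bootstrap_servers => "localhost:9092"
    topic_id => "demo-logs"
    codec => json
  }
}
```

Every Event is delivered to each output block, so this configuration writes the same data to both destinations.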
Codec Plugin
Decodes raw data into Events; encodes Events into the target data format.
Built-in Codec plugins
- Line / Multiline
- JSON / Avro
- Dots / Rubydebug
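A codec is attached to an input or output plugin rather than used on its own. The following minimal sketch, which assumes nothing beyond the codecs named in the list above, decodes with Line on the way in and encodes with Dots on the way out.

```
input {
  # Decode: each line read from stdin becomes one Event.
  stdin { codec => line }
}
output {
  # Encode: print one dot per Event, a cheap way to watch throughput.
  stdout { codec => dots }
}
```

Swapping the output codec for rubydebug prints the full Event instead, which is more convenient while debugging.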
Filter Plugin
Processes Events.
Built-in Filter plugins
- Mutate - modifies Event fields
- Metrics - aggregates metrics
- Ruby - executes Ruby code
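To make the Mutate and Ruby filters more concrete, here is a small sketch. The field name message_length is made up for this example; message is the field that the stdin input fills by default.

```
input {
  stdin {}
}
filter {
  # Mutate: lowercase the text of the message field.
  mutate {
    lowercase => ["message"]
  }
  # Ruby: run inline Ruby code against the Event.
  # "message_length" is a hypothetical field name used only in this sketch.
  ruby {
    code => "event.set('message_length', event.get('message').to_s.length)"
  }
}
output {
  stdout { codec => rubydebug }
}
```

Filters run in the order in which they appear, so the Ruby filter sees the already lowercased message.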
Queue

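In-Memory Queue (the default; data is lost if the process crashes or the machine goes down)
Persistent Queue (buffers Events on disk; enabled with queue.type: persisted in logstash.yml)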
Examples:
① Read single-line data and convert it into an Event.
logstash -e "input{stdin{codec=>json}}output{stdout{codec=>rubydebug}}"
② Read multi-line data
multiline.conf
input {
  stdin {
    # A line that starts with whitespace is appended to the previous line's Event
    codec => multiline {
      pattern => "^\s"
      what => "previous"
    }
  }
}
filter {}
output {
  stdout { codec => rubydebug }
}
③ Putting it all together
Download the CSV file from https://grouplens.org/datasets/movielens/
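input {
  file {
    # The file input requires an absolute path; replace this placeholder
    # with the real location of movies.csv
    path => "/path/to/movies.csv"
    start_position => "beginning"
    # Do not persist the read position, so the file is re-read on every run
    sincedb_path => "/dev/null"
  }
}
filter {
  # Parse each line into the fields id, content and genre
  csv {
    separator => ","
    columns => ["id","content","genre"]
  }
  # Turn the "|"-separated genre string into an array
  mutate {
    split => { "genre" => "|" }
    remove_field => ["path", "host","@timestamp","message"]
  }
  # Split content on "(" to separate the title from the year
  mutate {
    split => ["content", "("]
    add_field => {
      "title" => "%{[content][0]}"
      "year" => "%{[content][1]}"
    }
  }
  # Convert year to an integer, trim whitespace around title, drop helper fields
  mutate {
    convert => {
      "year" => "integer"
    }
    strip => ["title"]
    remove_field => ["path", "host","@timestamp","message","content"]
  }
}
output {
  # Use the movie id as the document id so that re-runs overwrite instead of duplicating
  elasticsearch {
    hosts => "http://localhost:9200"
    index => "movies"
    document_id => "%{id}"
  }
  stdout {}
}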