A common way to collect logs for analysis is syslog-ng. To collect nginx logs, for example, you configure syslog-ng on each machine running nginx, read the local log through a pipe, and forward it over UDP to target machines. A target machine can either write the received log to a local file,
or hand it to flume, which ships it into kafka.
1 Basic usage of syslog-ng
An example follows.
1.1 Source machine that collects the logs
syslog-ng.conf is configured as follows:
@version: 3.5
@include "scl.conf"
source s_local {
    system();
    internal();
};
source s_network {
    udp();
};
destination d_local {
    file("/var/log/messages");
};
source s_nginx { pipe("/opt/nginx/logs/access.pipe"); };
destination d1_nginx { udp("xx1.xx1.xx1.xx1" port(601)); };
destination d2_nginx { udp("xx2.xx2.xx2.xx2" port(9484)); };
log {
    source(s_nginx);
    destination(d1_nginx);
    destination(d2_nginx);
};
The configuration above reads the nginx log from the pipe and sends it over UDP to port 601 on xx1.xx1.xx1.xx1 and port 9484 on xx2.xx2.xx2.xx2. syslog-ng on xx1.xx1.xx1.xx1 is then configured to write the received log to local disk.
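Before the source machine's config can work, the pipe itself has to exist and nginx must write into it. A minimal sketch follows; a temp directory is used here only so the commands run anywhere without root (the real path is /opt/nginx/logs/access.pipe), and the nginx directive shown is an assumption about your nginx.conf layout.

```shell
# Create the named pipe that syslog-ng's pipe() source will read from.
PIPE_DIR=$(mktemp -d)
mkfifo "$PIPE_DIR/access.pipe"
# Point nginx at the pipe in nginx.conf (assumed directive):
#   access_log /opt/nginx/logs/access.pipe;
# Start syslog-ng (the reader) before nginx, otherwise nginx blocks when it
# opens the pipe for writing.
[ -p "$PIPE_DIR/access.pipe" ] && echo "pipe created"
```

Starting order matters: the pipe has no buffer of its own beyond the kernel's, so the reader should always be up before the writer.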
1.2 Target machine that writes logs to local disk
syslog-ng.conf is configured as follows:
@version: 3.5
@include "scl.conf"
options {
    chain_hostnames(off);
    flush_lines(1);
};
source s_local {
    system();
    internal();
};
source s_network {
    udp(ip(0.0.0.0) port(601));
};
destination d_network {
    file("/data10/nginx-log/nginx.log.$YEAR-$MONTH-$DAY" owner("real") group("real") perm(0644) create_dirs(yes));
};
log {
    source(s_network);
    destination(d_network);
};
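The source/target pair can be sanity-checked end to end by hand-crafting one UDP datagram with bash's built-in /dev/udp device. 127.0.0.1 stands in for the real target address here; the priority prefix `<190>` (facility local7, severity info) makes the line look like a syslog message.

```shell
# Send one syslog-style line over UDP to the port syslog-ng listens on.
printf '<190>nginx: udp test line\n' > /dev/udp/127.0.0.1/601 && echo "datagram sent"
# Then, on the target machine, watch the dated file fill up:
#   tail -f /data10/nginx-log/nginx.log.$(date +%Y-%m-%d)
```

Note that UDP is fire-and-forget: the send succeeds even if nothing is listening, so the real confirmation is the line appearing in the target file.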
1.3 Target machine that forwards logs to kafka
To process and analyze the logs in real time, they can be pushed into kafka via flume. Deploy flume on xx2.xx2.xx2.xx2.
flume.properties
nginx.sources = realtime-syslog
nginx.sinks = realtime-sink
nginx.channels = mem
# Describe/configure the source
nginx.sources.realtime-syslog.type = syslogudp
nginx.sources.realtime-syslog.port = 9484
# host is the local IP of this machine (xx2.xx2.xx2.xx2)
nginx.sources.realtime-syslog.host = xx2.xx2.xx2.xx2
nginx.sources.realtime-syslog.channels = mem
# configure the destination
nginx.sinks.realtime-sink.type = org.apache.flume.plugins.KafkaSink
# list of kafka brokers
nginx.sinks.realtime-sink.metadata.broker.list = ip1:port,ip2:port,ip3:port
nginx.sinks.realtime-sink.partition.key = 0
nginx.sinks.realtime-sink.partitioner.class = org.apache.flume.plugins.SinglePartition
nginx.sinks.realtime-sink.serializer.class = kafka.serializer.StringEncoder
nginx.sinks.realtime-sink.request.required.acks = 0
nginx.sinks.realtime-sink.max.message.size = 1000000
nginx.sinks.realtime-sink.producer.type = async
nginx.sinks.realtime-sink.custom.encoding = UTF-8
# kafka topic the logs are written to
nginx.sinks.realtime-sink.custom.topic.name = topic
nginx.channels.mem.type = memory
nginx.channels.mem.capacity = 800000
nginx.channels.mem.transactionCapacity = 800000
nginx.channels.mem.byteCapacity = 200000000
# Bind the source and sink to the channel
nginx.sinks.realtime-sink.channel = mem
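Once flume is running, a quick way to confirm events are actually flowing is kafka's console consumer. The install path and zookeeper address below are assumptions for a typical kafka 0.8.2 deployment; the topic name is the placeholder from the properties above.

```shell
# Tail the topic flume writes to (the 0.8-era console consumer reads
# the broker list from zookeeper; adjust paths to your install).
/opt/kafka/bin/kafka-console-consumer.sh --zookeeper zkhost:2181 --topic topic
```

If nginx traffic is arriving, log lines should scroll by within a second or two of each request.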
2 Compiling syslog-ng
First download eventlog
(from http://apache.bjcnc.scs.sohucs.com/eventlog-0.2.12.tar.gz or https://my.balabit.com/downloads/eventlog/0.2/eventlog_0.2.12.tar.gz)
On a redhat or centos system, first make sure the glib2 development headers are installed (either with the command below, or via yum groupinstall "Development tools" to pull in the common toolchain and libraries):
yum -y install glib2-devel
1. Compile eventlog
tar xzf eventlog-0.2.12.tar.gz
cd eventlog-0.2.12
./configure --prefix=/usr/local/syslog-ng-related/eventlog
make
make install
2. Compile syslog-ng
Download syslog-ng (from http://apache.bjcnc.scs.sohucs.com/syslog-ng-3.5.4.1.tar.gz or the official site https://syslog-ng.org/)
tar xzf syslog-ng-3.5.4.1.tar.gz
cd syslog-ng-3.5.4.1
export PKG_CONFIG_PATH=/usr/local/syslog-ng-related/eventlog/lib/pkgconfig
./configure --prefix=/usr/local/syslog-ng-related
make
make install
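The build can be spot-checked right after install. Both flags below are standard syslog-ng options; the config path assumes the default sysconfdir under the configure prefix used above.

```shell
# Print version/build info of the freshly installed binary.
/usr/local/syslog-ng-related/sbin/syslog-ng --version
# Validate a config file without starting the daemon.
/usr/local/syslog-ng-related/sbin/syslog-ng --syntax-only -f /usr/local/syslog-ng-related/etc/syslog-ng.conf
```

Running --syntax-only before every restart is cheap insurance against taking the collector down with a typo.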
3 Compiling syslog-ng with kafka support
Assuming the JDK is installed under /opt/apps/jdk, run export LD_LIBRARY_PATH=/opt/apps/jdk/jre/lib/amd64/server:$LD_LIBRARY_PATH,
or add /opt/apps/jdk/jre/lib/amd64/server to /etc/ld.so.conf and run ldconfig.
Download the latest syslog-ng, version 3.7.3; building its Java-based destinations (kafka among them) requires gradle.
Download gradle, unpack it into /opt/apps, and set the gradle environment variables:
export GRADLE_HOME=/opt/apps/gradle-2.2
export PATH=$JAVA_HOME/bin:$GRADLE_HOME/bin:$PATH
Create the directory /usr/local/syslog-ng-related/kafka and copy every jar from kafka's lib directory into it, for example:
kafka_2.10-0.8.2.1.jar kafka_2.10-0.8.2.1-scaladoc.jar kafka_2.10-0.8.2.1-test.jar slf4j-api-1.7.6.jar
kafka_2.10-0.8.2.1-javadoc.jar kafka_2.10-0.8.2.1-sources.jar kafka-clients-0.8.2.1.jar slf4j-log4j12-1.7.6.jar
Compile syslog-ng:
./configure --disable-python --prefix=/usr/local/syslog-ng-related/syslog-ng
make && make install
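Before writing the kafka config, it is worth confirming the Java destination module actually got built. The paths follow the prefixes used above; `-V` printing the module list is standard syslog-ng behavior.

```shell
# The java-modules directory should contain the syslog-ng Java jars.
ls /usr/local/syslog-ng-related/syslog-ng/lib/syslog-ng/java-modules/
# -V prints build details, including available modules; look for mod-java.
/usr/local/syslog-ng-related/syslog-ng/sbin/syslog-ng -V | grep -i java
```

If mod-java is missing, re-check that gradle was on PATH and the JVM library was resolvable (the LD_LIBRARY_PATH step above) when configure ran.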
When using syslog-ng with kafka, syslog-ng.conf looks like the following:
@version: 3.7
@module mod-java
@include "scl.conf"
source s_local {
    system();
    internal();
};
source s_nginx { pipe("/opt/nginx/logs/access.pipe"); };
destination d_kafka {
    java(
        class_path("/usr/local/syslog-ng-related/syslog-ng/lib/syslog-ng/java-modules/*.jar:/usr/local/syslog-ng-related/kafka/*.jar")
        class-name("org.syslog_ng.kafka.KafkaDestination")
        option("kafka-bootstrap-servers", "ip1:port,ip2:port,ip3:port")
        option("topic", "nginx_topic")
        #option("template", "$(format-json @timestamp=\"$ISODATE\" logsource=\"$HOST\" program=\"$PROGRAM\" message=\"$MESSAGE\" app=\"something\")")
        option("template", "$HOST $MESSAGE")
    );
};
destination d_udp { udp("xx2.xx2.xx2.xx2" port(601)); };
log {
    source(s_nginx);
    destination(d_kafka);
    destination(d_udp);
};
syslog-ng is then controlled with the usual init commands:
/etc/init.d/syslog-ng restart
/etc/init.d/syslog-ng start
/etc/init.d/syslog-ng stop
The /etc/init.d/syslog-ng script is as follows:
#!/bin/bash
# chkconfig: 2345 30 70
# description: syslog-ng log collector
syslogngd=/usr/local/syslog-ng-related/syslog-ng/sbin/syslog-ng
syslogng_pid=/usr/local/syslog-ng-related/syslog-ng/var/syslog-ng.pid
RETVAL=0
prog="syslog-ng"
# Source function library.
. /etc/rc.d/init.d/functions
[ -x $syslogngd ] || exit 0
# Start the syslog-ng daemon.
start() {
    if [ -e $syslogng_pid ]; then
        echo "syslog-ng already running...."
        exit 1
    fi
    echo -n $"Starting $prog: "
    daemon $syslogngd
    RETVAL=$?
    echo
    [ $RETVAL = 0 ] && touch /var/lock/subsys/syslog-ng
    return $RETVAL
}
# Stop the syslog-ng daemon.
stop() {
    echo -n $"Stopping $prog: "
    killproc $syslogngd
    RETVAL=$?
    echo
    [ $RETVAL = 0 ] && rm -f /var/lock/subsys/syslog-ng $syslogng_pid
    return $RETVAL
}
reload() {
    echo -n $"Reloading $prog: "
    killproc $syslogngd -HUP
    RETVAL=$?
    echo
}
# See how we were called.
case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    reload)
        reload
        ;;
    restart)
        stop
        start
        ;;
    status)
        status $prog
        RETVAL=$?
        ;;
    *)
        echo $"Usage: $prog {start|stop|restart|reload|status}"
        exit 1
esac
exit $RETVAL
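On RHEL/CentOS the script can then be registered with SysV init so syslog-ng comes up at boot. Note that chkconfig --add only works if the script carries a "# chkconfig:" header comment (runlevels plus start/stop priorities).

```shell
# Install and enable at boot (assumes the script above was saved as
# /etc/init.d/syslog-ng with a "# chkconfig: 2345 30 70" header line).
chmod +x /etc/init.d/syslog-ng
chkconfig --add syslog-ng
chkconfig syslog-ng on
```

After that, `chkconfig --list syslog-ng` shows which runlevels will start the service.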