|NO.Z.00089|BigDataEnd|Hadoop&kafka.V03|kafka.v03|Monitoring metrics|
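1. Cluster monitoring: monitoring metrics
### --- Monitoring metrics
~~~     Kafka uses Yammer Metrics to report metrics in the server and in the Scala clients.
~~~     The Java clients use Kafka Metrics, a built-in metrics registry
~~~     that minimizes the transitive dependencies pulled into client applications.
~~~     Both expose metrics through JMX and can be configured with pluggable stats reporters
~~~     to hook into your monitoring system. See the official documentation for the full list of metrics.
### --- JMX: open the JMX port on Kafka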
~~~     # Set JMX_PORT on every node
[root@hadoop01 ~]# vim /opt/yanqi/servers/kafka_ms/bin/kafka-server-start.sh
~~~     Add the following line at the top of the script (right after the shebang line)
export JMX_PORT=9581
### --- Start the Kafka cluster
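Before the restart, the `export JMX_PORT=9581` line has to be present in `kafka-server-start.sh` on every node. The following is a minimal sketch of an idempotent edit, assuming the script begins with a shebang line. `ensure_jmx_port` is a hypothetical helper name, not something shipped with Kafka.

```shell
#!/usr/bin/env bash
# ensure_jmx_port SCRIPT [PORT]
# Hypothetical helper: adds "export JMX_PORT=PORT" as line 2 of SCRIPT
# (right after the shebang) unless an export of JMX_PORT is already there.
ensure_jmx_port() {
  local script="$1" port="${2:-9581}" tmp
  if grep -q '^export JMX_PORT=' "$script"; then
    return 0   # already configured, leave the file untouched
  fi
  tmp="$(mktemp)" || return 1
  awk -v line="export JMX_PORT=${port}" \
    'NR == 1 { print; print line; next } { print }' "$script" > "$tmp" || return 1
  # Overwrite in place with cat so the script keeps its permissions.
  cat "$tmp" > "$script"
  rm -f "$tmp"
}

# Example (path taken from this walkthrough; run it on each node):
# ensure_jmx_port /opt/yanqi/servers/kafka_ms/bin/kafka-server-start.sh 9581
```

Running the helper twice leaves a single `export JMX_PORT` line, so it should be safe to repeat on nodes that were already edited by hand.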
~~~     Add JMX_PORT on every Kafka machine, then restart Kafka
[root@hadoop01 ~]# kafka-server-start.sh -daemon /opt/yanqi/servers/kafka_ms/config/server.properties
### --- Verify that JMX is enabled
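The check in this section comes down to comparing two PIDs: the process listening on the JMX port and the Kafka process reported by `jps`. The following is a minimal sketch of that comparison, assuming `ss` and `jps` are on the PATH and the port is 9581. The function names `listening_pid`, `kafka_pid` and `check_jmx` are hypothetical.

```shell
#!/usr/bin/env bash
# Read `ss -nlp` style output on stdin and print the PID of the first
# process bound to the given port.
listening_pid() {
  local port="$1"
  grep -E "[:.]${port}[[:space:]]" | grep -oE 'pid=[0-9]+' | head -n 1 | cut -d= -f2
}

# Read `jps` output on stdin and print the PID of the Kafka process.
kafka_pid() {
  awk '$2 == "Kafka" { print $1; exit }'
}

# Compare both PIDs on the local node; returns 0 only when they match.
check_jmx() {
  local port="${1:-9581}" lp kp
  lp="$(ss -nlp | listening_pid "$port")"
  kp="$(jps | kafka_pid)"
  if [ -n "$lp" ] && [ "$lp" = "$kp" ]; then
    echo "JMX port ${port} is owned by Kafka (pid ${kp})"
  else
    echo "JMX port ${port} is not owned by Kafka (port pid: ${lp:-none}, kafka pid: ${kp:-none})" >&2
    return 1
  fi
}

# Example (run on each node): check_jmx 9581
```

The parsing is kept in separate functions that read stdin so it can be tried against captured output without a running broker.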
~~~     First print the process that is listening on port 9581, then check that its PID is the Kafka PID.
~~~     # Check hadoop01
~~~     The PID listening on the port is pid=4127
[root@hadoop01 ~]# ss -nelp | grep 9581
tcp    LISTEN     0      50       :::9581                 :::*                   users:(("java",pid=4127,fd=78)) ino:39145 sk:ffff986bf87f2200 v6only:0 <->
~~~     # Port 9581 is listening
~~~     The Kafka PID matches the PID bound to port 9581, so the monitoring port is open
[root@hadoop01 ~]# jps
4127 Kafka
~~~     # Check hadoop02
[root@hadoop02 ~]# ss -nelp | grep 9581
tcp    LISTEN     0      50       :::9581                 :::*                   users:(("java",pid=8927,fd=78)) ino:55599 sk:ffff9fe139100840 v6only:0 <->
[root@hadoop02 ~]# jps
8927 Kafka
~~~     # Check hadoop03
~~~     Alternatively, check the Kafka startup log and confirm that the JVM option -Dcom.sun.management.jmxremote.port=9581 is present
[root@hadoop03 ~]# ss -nelp | grep 9581
tcp    LISTEN     0      50       :::9581                 :::*                   users:(("java",pid=9250,fd=78)) ino:55666 sk:ffff8cd9790da100 v6only:0 <->
[root@hadoop03 ~]# jps
9250 Kafka
2. Connect to the JMX port with JConsole
### --- Prepare topics to monitor
[root@hadoop01 ~]# kafka-topics.sh --zookeeper localhost:2181/myKafka \
--create --topic topic_x --partitions 3 --replication-factor 2
[root@hadoop01 ~]# kafka-topics.sh --zookeeper localhost:2181/myKafka \
--create --topic topic_y --partitions 3 --replication-factor 2
[root@hadoop01 ~]# kafka-topics.sh --zookeeper localhost:2181/myKafka \
--create --topic topic_z --partitions 3 --replication-factor 2
 
[root@hadoop01 ~]# kafka-topics.sh --zookeeper localhost:2181/myKafka --list
topic_x
topic_y
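topic_z
### --- On Windows/macOS, locate the jconsole tool and open it
~~~     It lives in ${JAVA_HOME}/bin/; on a Mac you can simply type jconsole in a terminal
~~~     On Windows: Win+R -> cmd -> jconsole
~~~     -> When prompted "Secure connection failed. Retry insecurely?", choose the insecure connection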
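On a host without a desktop, the same MBeans can also be read from the command line. The following is a sketch only. It assumes the `kafka.tools.JmxTool` class bundled with Kafka 1.x is reachable through `kafka-run-class.sh`, and that the broker exposes JMX on port 9581 as configured earlier. `jmx_url` and `read_bytes_in` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Build the standard JMX-over-RMI service URL for a host and port.
jmx_url() {
  local host="$1" port="$2"
  printf 'service:jmx:rmi:///jndi/rmi://%s:%s/jmxrmi\n' "$host" "$port"
}

# Poll BytesInPerSec on one broker every 5 seconds (Ctrl+C to stop).
# Assumption: kafka-run-class.sh is on the PATH and this Kafka version
# still ships kafka.tools.JmxTool.
read_bytes_in() {
  local host="$1" port="${2:-9581}"
  kafka-run-class.sh kafka.tools.JmxTool \
    --jmx-url "$(jmx_url "$host" "$port")" \
    --object-name 'kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec' \
    --attributes Count,OneMinuteRate \
    --reporting-interval 5000
}

# Example (needs a running broker): read_bytes_in hadoop01 9581
```

The URL produced by `jmx_url` is the generic JMX form of the same `host:port` endpoint that JConsole connects to.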


### --- Consistent with what JMX shows: this node holds 2 partitions of topic_x, partition 0 and partition 1
[root@hadoop01 ~]#  kafka-topics.sh --zookeeper localhost:2181/myKafka \
--describe --topic topic_x
Topic:topic_x   PartitionCount:3    ReplicationFactor:2 Configs:
    Topic: topic_x  Partition: 0    Leader: 0   Replicas: 0,2   Isr: 0,2
    Topic: topic_x  Partition: 1    Leader: 1   Replicas: 1,0   Isr: 1,0
    Topic: topic_x  Partition: 2    Leader: 2   Replicas: 2,1   Isr: 2,1
3. Detailed monitoring metrics
### --- Detailed monitoring metrics:
~~~     See the official documentation: http://kafka.apache.org/10/documentation.html#monitoring
~~~     The commonly used ones are listed here:
OS metrics
| objectName | Metric | Description |
| --- | --- | --- |
| java.lang:type=OperatingSystem | FreePhysicalMemorySize | Free physical memory |
| java.lang:type=OperatingSystem | SystemCpuLoad | System CPU utilization |
| java.lang:type=OperatingSystem | ProcessCpuLoad | CPU utilization of this process |
| java.lang:type=GarbageCollector,name=G1 Young Generation | CollectionCount | Number of GC runs |
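4. Broker metrics
~~~     For the BrokerTopicMetrics meters, Count is a running total since broker start;
~~~     the per-second rates are exposed as MeanRate, OneMinuteRate, FiveMinuteRate and FifteenMinuteRate.
~~~     For the RequestMetrics histograms, Count is the number of requests measured;
~~~     the latency itself is in Mean, Max and the percentile attributes.
| objectName | Metric | Description |
| --- | --- | --- |
| kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec | Count | Inbound traffic in bytes |
| kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec | Count | Outbound traffic in bytes |
| kafka.server:type=BrokerTopicMetrics,name=BytesRejectedPerSec | Count | Rejected (dropped) traffic in bytes |
| kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec | Count | Messages written |
| kafka.server:type=BrokerTopicMetrics,name=FailedFetchRequestsPerSec | Count | Failed fetch requests on this broker |
| kafka.server:type=BrokerTopicMetrics,name=FailedProduceRequestsPerSec | Count | Failed produce requests on this broker |
| kafka.server:type=ReplicaManager,name=PartitionCount | Value | Number of partitions on this broker |
| kafka.server:type=ReplicaManager,name=LeaderCount | Value | Number of leader replicas on this broker |
| kafka.network:type=RequestMetrics,name=TotalTimeMs,request=FetchConsumer | Count | Total time taken to serve a FetchConsumer request |
| kafka.network:type=RequestMetrics,name=TotalTimeMs,request=FetchFollower | Count | Total time taken to serve a FetchFollower request |
| kafka.network:type=RequestMetrics,name=TotalTimeMs,request=Produce | Count | Total time taken to serve a Produce request |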
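The `Count` attribute of a meter such as `BytesInPerSec` is a running total, so a per-second figure over a chosen window has to be derived from two samples. The following is a minimal sketch of that arithmetic. `rate_per_sec` is a hypothetical helper, and the sample values in the example are made up for illustration.

```shell
#!/usr/bin/env bash
# rate_per_sec COUNT_1 COUNT_2 SECONDS
# Average per-second rate between two cumulative Count samples taken
# SECONDS apart. Fails (non-zero exit, no output) if SECONDS is not positive.
rate_per_sec() {
  local c1="$1" c2="$2" seconds="$3"
  awk -v a="$c1" -v b="$c2" -v s="$seconds" \
    'BEGIN { if (s <= 0) exit 1; printf "%.2f\n", (b - a) / s }'
}

# Example with made-up samples taken 60 seconds apart:
# rate_per_sec 1000 4000 60
```

The meter's own `OneMinuteRate` attribute is a moving average, so a value computed this way over a fixed window will usually differ slightly from it.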
5. Producer and topic metrics
| objectName | Metric | Official description | Notes |
| --- | --- | --- | --- |
| kafka.producer:type=producer-metrics,client-id=console-producer (the client-id varies) | incoming-byte-rate | The average number of incoming bytes received per second from all servers. | Average bytes per second the producer receives |
| kafka.producer:type=producer-metrics,client-id=console-producer (the client-id varies) | outgoing-byte-rate | The average number of outgoing bytes sent per second to all servers. | Average bytes per second the producer sends |
| kafka.producer:type=producer-metrics,client-id=console-producer (the client-id varies) | request-rate | The average number of requests sent per second to the broker. | Average number of requests per second the producer sends to the broker |
| kafka.producer:type=producer-metrics,client-id=console-producer (the client-id varies) | response-rate | The average number of responses received per second from the broker. | Average number of responses per second the producer receives from the broker |
| kafka.producer:type=producer-metrics,client-id=console-producer (the client-id varies) | request-latency-avg | The average request latency in ms. | Average latency of a request |
| kafka.producer:type=producer-topic-metrics,client-id=console-producer,topic=testjmx (the client-id and topic name vary) | record-send-rate | The average number of records sent per second for a topic. | Average number of records per second sent to the topic |
| kafka.producer:type=producer-topic-metrics,client-id=console-producer,topic=testjmx (the client-id and topic name vary) | record-retry-total | The total number of retried record sends | Total number of records whose send was retried |
| kafka.producer:type=producer-topic-metrics,client-id=console-producer,topic=testjmx (the client-id and topic name vary) | record-error-total | The total number of record sends that resulted in errors | Total number of records whose send failed |
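Because the client-id and topic are part of the MBean name, a monitoring script usually has to assemble the object name per client. The following is a minimal sketch using the hyphenated names from the official documentation. `producer_object_name` is a hypothetical helper, and `console-producer` and `testjmx` are only the example values used in the table.

```shell
#!/usr/bin/env bash
# producer_object_name CLIENT_ID [TOPIC]
# Prints the producer-level MBean name, or the per-topic MBean name
# when a topic is given.
producer_object_name() {
  local client_id="$1" topic="${2:-}"
  if [ -n "$topic" ]; then
    printf 'kafka.producer:type=producer-topic-metrics,client-id=%s,topic=%s\n' \
      "$client_id" "$topic"
  else
    printf 'kafka.producer:type=producer-metrics,client-id=%s\n' "$client_id"
  fi
}

# Examples:
# producer_object_name console-producer
# producer_object_name console-producer testjmx
```

These MBeans live in the producer's own JVM rather than in the broker, so that process needs a JMX port of its own before a remote tool can read them.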
6. Consumer metrics
| objectName | Metric | Official description | Notes |
| --- | --- | --- | --- |
| kafka.consumer:type=consumer-fetch-manager-metrics,client-id=consumer-1 (the client-id varies) | records-lag-max | Number of messages the consumer lags behind the producer by. Published by the consumer, not broker. | Consumer lag in messages, as reported by the consumer |
| kafka.consumer:type=consumer-fetch-manager-metrics,client-id=consumer-1 (the client-id varies) | records-consumed-rate | The average number of records consumed per second | Average number of messages consumed per second |
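A lag figure such as `records-lag-max` is typically compared against a threshold once it has been collected. The following is a minimal sketch of that comparison, assuming the sample has already been read from JMX as text. `lag_status` is a hypothetical helper, and the threshold in the example is arbitrary.

```shell
#!/usr/bin/env bash
# lag_status LAG THRESHOLD
# Returns 0 when LAG is within THRESHOLD, 1 when it exceeds it, and 2 when
# LAG is not a plain non-negative number (for example, no sample yet).
lag_status() {
  local lag="$1" threshold="$2"
  if ! printf '%s\n' "$lag" | grep -Eq '^[0-9]+([.][0-9]+)?$'; then
    return 2
  fi
  awk -v l="$lag" -v t="$threshold" 'BEGIN { exit (l + 0 <= t + 0) ? 0 : 1 }'
}

# Example with an arbitrary threshold of 1000 messages:
# lag_status 12.0 1000 && echo "lag ok"
```

Giving unusable samples their own return code lets a caller tell "no data" apart from "lag too high".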
