[zookeeper] Error notes: zookeeper Packet len* is out of range
Preface
Recently I noticed that the ZooKeeper in our test environment had been stuck in long Full GC pauses, which made the service unstable. After digging in, the cause turned out to be that the ZooKeeper service was storing far too much data, which took up too much JVM heap and left the test environment essentially unusable. After increasing the JVM heap size and restarting zk, I set out to clean up the data, but the delete operations failed with an error. This post records the problem and how it was resolved.
Command executed
After connecting with the ZooKeeper CLI, both ls and stat on this node failed:
ls /BCH_ID_PATH/YJF
Error output
[zk: 127.0.0.1:2181(CONNECTED) 3] ls /BSFIT_COMMON_ID_PATH/20220418/HCF
2022-06-22 17:51:52,462 [myid:] - WARN [main-SendThread(localhost:2181):ClientCnxn$SendThread@1147] - Session 0xff818a24c32f059d for server localhost/127.0.0.1:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Packet len8623844 is out of range!
at org.apache.zookeeper.ClientCnxnSocket.readLength(ClientCnxnSocket.java:112)
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:79)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1126)
WATCHER::
WatchedEvent state:Disconnected type:None path:null
Exception in thread "main" org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /BSFIT_COMMON_ID_PATH/20220418/HCF
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1468)
at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1496)
at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:723)
at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:591)
at org.apache.zookeeper.ZooKeeperMain.executeLine(ZooKeeperMain.java:363)
at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:323)
at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:282)
root@nt-nrdp-kafka-01.bigdata.com:/root#
Root cause
The amount of data returned by a single client operation exceeded the jute.maxbuffer limit. Here the getChildren response for the znode was 8,623,844 bytes, while the client's default limit is 4096 * 1024 = 4,194,304 bytes.
Relevant code
// org.apache.zookeeper.ClientCnxnSocket
protected final ByteBuffer lenBuffer = ByteBuffer.allocateDirect(4);
protected ByteBuffer incomingBuffer = lenBuffer;

protected void readLength() throws IOException {
    int len = incomingBuffer.getInt();
    // Reject any response whose declared length exceeds the configured limit.
    if (len < 0 || len >= ClientCnxn.packetLen) {
        throw new IOException("Packet len" + len + " is out of range!");
    }
    incomingBuffer = ByteBuffer.allocate(len);
}

// org.apache.zookeeper.ClientCnxn - the limit is read once, from the
// jute.maxbuffer system property, defaulting to 4096 * 1024 = 4 MB.
public static final int packetLen = Integer.getInteger("jute.maxbuffer", 4096 * 1024);
Solution
Java code
For Java code, it is enough to set the jute.maxbuffer system property (it is a JVM system property, not an environment variable). Because ClientCnxn reads it into a static final field, it must be set before any ZooKeeper client class is loaded, and the value has to be larger than the offending packet (8,623,844 bytes in this case):
System.setProperty("jute.maxbuffer", String.valueOf(41943040));
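If changing the code is inconvenient, the same property can also be passed on the JVM command line when the application starts, which sidesteps the ordering problem entirely (the jar name below is only a placeholder):

java -Djute.maxbuffer=41943040 -jar your-app.jar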
zkCli.sh
For the shell client, modify the java invocation at the end of the client launch script zkCli.sh and add -Djute.maxbuffer:
$JAVA" "-Dzookeeper.log.dir=${ZOO_LOG_DIR}" "-Dzookeeper.root.logger=${ZOO_LOG4J_PROP}" \
"-Djute.maxbuffer=41943040" \
-cp "$CLASSPATH" $CLIENT_JVMFLAGS $JVMFLAGS \
org.apache.zookeeper.ZooKeeperMain "$@"
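Because the launch command above already expands $CLIENT_JVMFLAGS, on most versions you can pass the flag through that environment variable instead of editing the script (assuming your zkCli.sh matches the snippet above):

CLIENT_JVMFLAGS="-Djute.maxbuffer=41943040" ./zkCli.sh -server 127.0.0.1:2181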
server
The server enforces the same limit on incoming packets, so add a -Djute.maxbuffer setting to zkServer.sh as well. Here 10 MB (10485760 bytes) is used as an example; size it according to your actual data (ZOO_USER_CFG is the marker used for the modified part), as sketched below.
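The original script change is not reproduced here, so the following is only a sketch of the idea: define the extra flag in a ZOO_USER_CFG variable near the top of zkServer.sh and append it to the java start command (the exact start line varies between ZooKeeper versions):

# added near the top of zkServer.sh
ZOO_USER_CFG="-Djute.maxbuffer=10485760"    # 10 MB, adjust to your data size

# then append $ZOO_USER_CFG to the existing start command, for example:
nohup "$JAVA" "-Dzookeeper.log.dir=${ZOO_LOG_DIR}" "-Dzookeeper.root.logger=${ZOO_LOG4J_PROP}" \
    $ZOO_USER_CFG -cp "$CLASSPATH" $JVMFLAGS $ZOOMAIN "$ZOOCFG" > "$_ZOO_DAEMON_OUT" 2>&1 < /dev/null &

Remember to restart the ZooKeeper server after the change.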
Note
ZooKeeper is designed as a high-throughput system. To keep reads fast, it does not fetch data from disk on demand; everything is served directly from memory. In other words, every server in a ZooKeeper ensemble holds the full data set, and all of it is loaded into memory. In addition, znode data does not support append operations; every write replaces the entire value.
From the analysis above it follows that if a znode grows too large, reading or writing it introduces unpredictable latency, and oversized znodes quickly exhaust the ZooKeeper servers' memory. This is why ZooKeeper is not suitable for storing large amounts of data.