Kafka

Contents

1. A Kafka broker fails to start: the pod shows Running with 0/1 ready, and the log reports "Failed to load record batch at position 292543977 from FileRecords"

1. A Kafka broker fails to start: the pod shows Running with 0/1 ready, and the log reports "Failed to load record batch at position 292543977 from FileRecords"

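This happens when a log segment file of one topic partition on that broker is damaged. Deleting the damaged file and restarting the broker resolves it.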
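The damaged file is named in the broker's own error line, so it can be pulled out of the log rather than copied by hand. Below is a minimal sketch, assuming bash with sed, tr and sort. extract_segment_path is a hypothetical helper name, and the sample line is the first error line from the log excerpt at the end of this post. Whitespace inside the path is stripped because that excerpt shows stray spaces there; this assumes real paths contain no spaces.

```shell
#!/usr/bin/env bash
# Sketch only: extract_segment_path is a hypothetical helper, not part of Kafka or kubectl.
# It reads broker log text on stdin and prints each distinct file= path that appears
# in a "Failed to load record batch ... FileRecords(file=...)" error line.
extract_segment_path() {
  sed -n 's/.*Failed to load record batch.*FileRecords(file=\([^,]*\),.*/\1/p' \
    | tr -d ' \t' \
    | sort -u
}

# Sample input: the first error line from the log excerpt in this post.
sample='[2022-06-15 01:54:31,668] ERROR There was an error in one of the threads during logs loading: org.apache.kafka.common.KafkaException: Failed to load record batch at position 292543977 from FileRecords(file= /var/lib/kafka1/disk00/ test_topic-0/00000000000021269865.log, start=0, end=2147483647) (kafka.log.LogManager)'

segment_path=$(printf '%s\n' "$sample" | extract_segment_path)

# The file name is what gets deleted; the parent directory is the topic partition.
echo "segment file:  $(basename "$segment_path")"
echo "partition dir: $(basename "$(dirname "$segment_path")")"
```

Against a live cluster the same function could be fed from the pod log, for example with kubectl -n component logs <pod-name> | extract_segment_path. The printed path is the path inside the container; on the host the file sits under the volume's mount directory.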

  1. Mark the node that hosts the failed Kafka broker as unschedulable. This example assumes the node is controller-72.

    [root@controller-71 ~]# kubectl cordon controller-72
    node/controller-72 cordoned
    
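  2. Delete the failed Kafka pod. With its node cordoned, the recreated pod has no eligible node to schedule onto and stays Pending, so it cannot read or write the data files on that node.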

    [root@controller-71 ~]# kubectl -n component delete pod kafka-default-0
    
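  3. Log in to the node that hosted the failed broker, then find and delete the damaged partition log segment file. Segment files are named after their base offset, so the same file name can exist under other partitions; the match below is limited to the test_topic-0 directory named in the error.

    [root@controller-71 ~]# ssh root@controller-72
    [root@controller-72 ~]# find /mnt/locals/kafka/ -path '*/test_topic-0/00000000000021269865.log' -exec rm -f {} \;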
    
  4. Uncordon the node so that the failed broker's pod can be scheduled onto it and start.

    [root@controller-71 ~]# kubectl uncordon controller-72
    
  5. Check that all Kafka brokers are running normally: every pod should be Running and 1/1 ready.
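The final check can be scripted instead of read by eye. The following is a rough sketch, assuming bash and awk; not_ready_pods is a hypothetical helper name, and the sample rows only imitate the NAME READY STATUS RESTARTS AGE layout of kubectl get pods --no-headers with illustrative pod names.

```shell
#!/usr/bin/env bash
# Sketch only: not_ready_pods is a hypothetical helper.
# It reads `kubectl get pods --no-headers` output on stdin and prints every pod
# whose STATUS is not Running or whose READY count is incomplete (for example 0/1).
not_ready_pods() {
  awk '{
    split($2, ready, "/")   # READY column has the form "ready/total"
    if ($3 != "Running" || ready[1] != ready[2]) print $1, $2, $3
  }'
}

# Illustrative rows; the pod names are assumptions, not taken from a real cluster.
sample='kafka-default-0   0/1   Running   3   10m
kafka-default-1   1/1   Running   0   5d
kafka-default-2   1/1   Running   0   5d'

# Prints only the rows that still need attention.
printf '%s\n' "$sample" | not_ready_pods
```

With the sample rows this prints kafka-default-0 0/1 Running. On a cluster it could be run as kubectl -n component get pods --no-headers | not_ready_pods, where empty output means every pod in the namespace is Running and fully ready.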

The key part of the failed broker's error log follows:

[2022-06-15 01:54:31,668] ERROR There was an error in one of the threads during logs loading: org.apache.kafka.common.KafkaException: Failed to load record batch at position 292543977 from FileRecords(file= /var/lib/kafka1/disk00/ test_topic-0/00000000000021269865.log, start=0, end=2147483647) (kafka.log.LogManager)
[2022-06-15 01:54:31,674] ERROR [KafkaServer id=0] Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
org.apache.kafka.common.KafkaException: Failed to load record batch at position 292543977 from FileRecords(file= /var/lib/kafka1/disk00/ test_topic-0/00000000000021269865.log, start=0, end=2147483647)
	at org.apache.kafka.common.record.FileLogInputStream$FileChannelRecordBatch.loadBatchWithSize(FileLogInputStream.java:217)
	at org.apache.kafka.common.record.FileLogInputStream$FileChannelRecordBatch.loadFullBatch(FileLogInputStream.java:194)
	at org.apache.kafka.common.record.FileLogInputStream$FileChannelRecordBatch.ensureValid(FileLogInputStream.java:166)
	at kafka.log.LogSegment.$anonfun$recover$1(LogSegment.scala:343)
	at kafka.log.LogSegment.$anonfun$recover$1$adapted(LogSegment.scala:342)
	at scala.collection.Iterator.foreach(Iterator.scala:941)
	at scala.collection.Iterator.foreach$(Iterator.scala:941)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
	at scala.collection.IterableLike.foreach(IterableLike.scala:74)
	at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
	at kafka.log.LogSegment.recover(LogSegment.scala:342)
	at kafka.log.Log.recoverSegment(Log.scala:500)
	at kafka.log.Log.$anonfun$loadSegmentFiles$3(Log.scala:482)
	at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:792)
	at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
	at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
	at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:791)
	at kafka.log.Log.loadSegmentFiles(Log.scala:454)
	at kafka.log.Log.$anonfun$loadSegments$1(Log.scala:565)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
	at kafka.log.Log.retryOnOffsetOverflow(Log.scala:2069)
	at kafka.log.Log.loadSegments(Log.scala:559)
	at kafka.log.Log.<init>(Log.scala:292)
	at kafka.log.Log$.apply(Log.scala:2203)
	at kafka.log.LogManager.loadLog(LogManager.scala:275)
	at kafka.log.LogManager.$anonfun$loadLogs$12(LogManager.scala:345)
	at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:63)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Input/output error
	at sun.nio.ch.FileDispatcherImpl.pread0(Native Method)
	at sun.nio.ch.FileDispatcherImpl.pread(FileDispatcherImpl.java:52)
	at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:220)
	at sun.nio.ch.IOUtil.read(IOUtil.java:197)
	at sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:735)
	at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:721)
	at org.apache.kafka.common.utils.Utils.readFully(Utils.java:952)
	at org.apache.kafka.common.utils.Utils.readFullyOrFail(Utils.java:925)
	at org.apache.kafka.common.record.FileLogInputStream$FileChannelRecordBatch.loadBatchWithSize(FileLogInputStream.java:213)
	... 33 more
posted @ 2022-10-01 11:31  打倒资本主义