Cassandra node cannot rejoin the cluster after a restart and reports "received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397"

The environment is a Cassandra cluster with 6 nodes across 2 datacenters, running version 2.1.9.

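Today, after rebooting one machine in the cluster, 10.168.12.3, we found that the node could not rejoin the cluster. The symptoms and analysis follow.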

Checking the cluster status from the restarted node showed everything as normal.

$ nodetool status
Datacenter: DC-SGM-DR
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns    Host ID                               Rack
UN  10.168.50.205  822.91 MB  256     ?       bea84e24-76c8-4070-9c41-d0051d8aba63  RAC-1B
UN  10.168.50.212  825.43 MB  256     ?       97e92d11-028a-44f6-b6ea-be3992985506  RAC-1B
UN  10.168.50.213  14.37 GB   256     ?       de47960c-54ab-4ed3-99e7-e3abcb66c014  RAC-1B
Datacenter: DC-SGM-SH
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns    Host ID                               Rack
UN  10.168.11.11   10.17 GB   256     ?       9d016b9f-5655-4899-8652-607bdc24eda3  RAC-1A
UN  10.168.12.3    831.42 MB  256     ?       57c4d98b-c52c-48bf-b8ee-7d8f22bcc08f  RAC-1A
UN  10.168.11.6    828.2 MB   256     ?       9cf69121-4dbc-419c-b3a8-e166d83b4177  RAC-1A

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless

We then logged in to the other nodes in the cluster and checked the cluster status from there.

$ nodetool status
Datacenter: DC-SGM-DR
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns    Host ID                               Rack
UN  10.168.50.205  828.16 MB  256     ?       bea84e24-76c8-4070-9c41-d0051d8aba63  RAC-1B
UN  10.168.50.212  825.43 MB  256     ?       97e92d11-028a-44f6-b6ea-be3992985506  RAC-1B
UN  10.168.50.213  14.37 GB   256     ?       de47960c-54ab-4ed3-99e7-e3abcb66c014  RAC-1B
Datacenter: DC-SGM-SH
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns    Host ID                               Rack
UN  10.168.11.11   834.48 MB  256     ?       9d016b9f-5655-4899-8652-607bdc24eda3  RAC-1A
DN  10.168.12.3    831.31 MB  256     ?       57c4d98b-c52c-48bf-b8ee-7d8f22bcc08f  RAC-1A
UN  10.168.11.6    828.17 MB  256     ?       9cf69121-4dbc-419c-b3a8-e166d83b4177  RAC-1A

Note: Non-system keyspaces don't have the same replication settings, effective ownership information is meaningless

The other nodes show the restarted node in the "DN" state, and the Cassandra system.log on each of them reports the following warning repeatedly.

..................................................
WARN  [GossipStage:1] 2020-01-02 10:07:45,831 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:07:47,680 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:07:49,682 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:07:50,690 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:07:50,833 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:07:51,681 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:07:51,833 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:07:52,833 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:07:54,684 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:07:55,683 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:07:55,834 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:07:57,683 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:07:58,684 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:08:00,684 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:08:01,688 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:08:05,686 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:08:06,686 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:08:08,838 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:08:09,839 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:08:11,688 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:08:11,839 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:08:12,840 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:08:13,688 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:08:17,841 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:08:20,690 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:08:21,691 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:08:21,843 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:08:22,691 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
WARN  [GossipStage:1] 2020-01-02 10:08:22,843 Gossiper.java:1105 - received an invalid gossip generation for peer /10.168.12.3; local generation = 1527840276, received generation = 1577928397
..................................................
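
The two generation values can be pulled out of these warnings programmatically. The following is a minimal sketch, assuming the log line format shown above; the regular expression, group names, and the function `parse_gossip_warning` are our own illustration and not part of Cassandra.

```python
import re

# Pattern for the Gossiper warning quoted above; group names are our own labels.
GOSSIP_WARN = re.compile(
    r"received an invalid gossip generation for peer /(?P<peer>[\d.]+); "
    r"local generation = (?P<local>\d+), received generation = (?P<received>\d+)"
)

def parse_gossip_warning(line):
    """Return (peer, local_generation, received_generation), or None if no match."""
    m = GOSSIP_WARN.search(line)
    if m is None:
        return None
    return m.group("peer"), int(m.group("local")), int(m.group("received"))

sample = (
    "WARN  [GossipStage:1] 2020-01-02 10:07:45,831 Gossiper.java:1105 - "
    "received an invalid gossip generation for peer /10.168.12.3; "
    "local generation = 1527840276, received generation = 1577928397"
)
peer, local_gen, received_gen = parse_gossip_warning(sample)
# Print the peer, both generations, and the gap between them in seconds.
print(peer, local_gen, received_gen, received_gen - local_gen)
```

The last printed number is the gap in seconds between the generation the peers still hold and the one the restarted node now announces.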

We logged in to the restarted Cassandra node and checked gossipinfo.

$ nodetool gossipinfo
.................................
/10.168.12.3
  generation:1527840276
  heartbeat:22488596
  HOST_ID:57c4d98b-c52c-48bf-b8ee-7d8f22bcc08f
  SCHEMA:54b29ca7-5a9c-345b-be73-437504faf71b
  SEVERITY:0.0
  NET_VERSION:8
  RACK:RAC-1A
  DC:DC-SGM-SH
  RELEASE_VERSION:2.1.9
  STATUS:NORMAL,-101651619030947983
  RPC_ADDRESS:10.168.12.3
  LOAD:8.72963151E8
.................................

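The other nodes record the restarted node's generation as epoch 1527840276, which converts to Friday, June 1, 2018, 08:04 (UTC). That is when we had last started Cassandra on this node. Next, we logged in to the restarted node and queried the system.local table: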

$ cqlsh `hostname` -u cassandra
cassandra@cqlsh> use system;
cassandra@cqlsh:system> select key , gossip_generation from local ;

 key   | gossip_generation
-------+-------------------
 local |        1577928397

(1 rows)
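
Both generation values are Unix epoch seconds, so they can be converted to readable times directly. Below is a minimal sketch using only the Python standard library; the helper name `generation_to_utc` is our own, and the times are rendered in UTC.

```python
from datetime import datetime, timezone

def generation_to_utc(generation):
    """Interpret a gossip generation (epoch seconds) as a UTC datetime."""
    return datetime.fromtimestamp(generation, tz=timezone.utc)

old = generation_to_utc(1527840276)  # generation held by the other nodes
new = generation_to_utc(1577928397)  # gossip_generation in system.local

print(old.strftime("%Y-%m-%d %H:%M:%S %A"))  # 2018-06-01 08:04:36 Friday
print(new.strftime("%Y-%m-%d %H:%M:%S %A"))  # 2020-01-02 01:26:37 Thursday
print((new - old).days)                      # 579
```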

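1577928397 converts to Thursday, January 2, 2020, 01:26 (UTC), which is when the node started up after this reboot, since the gossip generation is set at startup. The two timestamps are about a year and a half apart; in other words, before this reboot Cassandra had last been started on Friday, June 1, 2018, at 08:04 (UTC). This restart in fact triggered a known Cassandra bug: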

https://issues.apache.org/jira/browse/CASSANDRA-10969

https://support.datastax.com/hc/en-us/articles/115001096783-Nodes-showing-DN-in-nodetool-status-with-invalid-gossip-generation-warning-in-logs
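
As we understand CASSANDRA-10969, affected versions reject a received generation that is more than about one year ahead of the generation a peer already holds for that node. The sketch below illustrates that comparison with the values from this incident. The constant name, the exact threshold of 365 days, and the function `is_rejected` are assumptions for illustration, not Cassandra's actual code.

```python
# Assumed threshold: one year in seconds. Illustrative, not copied from Cassandra.
MAX_GENERATION_DIFFERENCE = 86400 * 365

def is_rejected(local_generation, received_generation):
    """True if the received generation is too far ahead of the one the peer holds."""
    return (
        local_generation != 0
        and received_generation > local_generation + MAX_GENERATION_DIFFERENCE
    )

# Values from the warning: the node had been up for roughly 579 days.
print(is_rejected(1527840276, 1577928397))         # True
# A restart one day after the previous start would be accepted.
print(is_rejected(1527840276, 1527840276 + 86400))  # False
```

Under this reading, any node that stays up for more than a year is exposed to the problem on its next restart, presumably until its peers stop holding the old generation in memory.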

This blog post by an expert is also worth reading:

https://mash213.wordpress.com/2019/07/05/scylla-received-an-invalid-gossip-generation-for-peer-how-to-resolve/

We then restarted the cluster nodes one by one.

===================From 一泽涟漪's blog; please credit the source when reposting: www.cnblogs.com/ilifeilong===================
posted on 2020-01-02 16:09 by 一泽涟漪