ECS migration breaks a RabbitMQ cluster

Failure scenario

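  • On 2023-02-11 the 3 ECS instances hosting the RabbitMQ cluster were migrated. The migration changed the machines' hostnames, and the cluster stopped working afterwards. Checking the cluster status reported the following error: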
[root@u-69197-iot ~]# rabbitmqctl cluster_status
Error: unable to perform an operation on node 'rabbit@u-69197-iot'. Please see diagnostics information and suggestions below.

Most common reasons for this are:

 * Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
 * CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)
 * Target node is not running

In addition to the diagnostics info below:

 * See the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more
 * Consult server logs on node rabbit@u-69197-iot
 * If target node is configured to use long node names, don't forget to use --longnames with CLI tools

DIAGNOSTICS
===========

attempted to contact: ['rabbit@u-69197-iot']

rabbit@u-69197-iot:
  * connected to epmd (port 4369) on u-69197-iot
  * epmd reports node 'rabbit' uses port 25672 for inter-node and CLI tool traffic 
  * TCP connection succeeded but Erlang distribution failed 

  * Node name (or hostname) mismatch: node "rabbit@iZbp12le2q9si3aqiyr4zfZ" believes its node name is not "rabbit@iZbp12le2q9si3aqiyr4zfZ" but something else.
    All nodes and CLI tools must refer to node "rabbit@iZbp12le2q9si3aqiyr4zfZ" using the same name the node itself uses (see its logs to find out what it is)


Current node details:
 * node name: 'rabbitmqcli-1739-rabbit@u-69197-iot'
 * effective user's home directory: /root
 * Erlang cookie hash: 2Hi4RI3kn7cLq4+Xw+tVlQ==

Key errors

  • TCP connection succeeded but Erlang distribution failed
  • Node name (or hostname) mismatch: node "rabbit@iZbp12le2q9si3aqiyr4zfZ" believes its node name is not "rabbit@iZbp12le2q9si3aqiyr4zfZ" but something else.
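Both messages point at the same cause: the name the running node registered under no longer matches the name the CLI builds from the current hostname. By default a RabbitMQ node is named rabbit@ followed by the short hostname, so a hostname change is enough to trigger this. The following is a minimal sketch of that comparison; `node_host` and `check_node_name` are hypothetical helper names, not part of RabbitMQ.

```shell
#!/bin/sh
# Hypothetical helpers (not part of RabbitMQ) illustrating the mismatch check.

# Print the host part of an Erlang node name, e.g. rabbit@u-69197-iot -> u-69197-iot
node_host() {
    printf '%s\n' "${1#*@}"
}

# Compare the host part of a node name with a given short hostname.
# Prints "match" or "mismatch".
check_node_name() {
    expected_node=$1
    actual_host=$2
    if [ "$(node_host "$expected_node")" = "$actual_host" ]; then
        echo "match"
    else
        echo "mismatch"
    fi
}

# On a real host the second argument would come from the machine itself:
#   check_node_name "rabbit@u-69197-iot" "$(hostname -s)"
```

A "mismatch" result on any node suggests the hostname has to be corrected before the node can be reached under its expected name.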

Solution

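  • Step 1, [all nodes of the RabbitMQ cluster]: set the hostname
# Node 1: hostnamectl set-hostname u-69197-iot
# Node 2: hostnamectl set-hostname u-248214-iot
# Node 3: hostnamectl set-hostname u-10099-iot
  • Step 2, [all nodes of the RabbitMQ cluster]: edit /etc/hosts (e.g. vi /etc/hosts) so that it contains the following entries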
192.168.0.61    u-69197-iot
192.168.0.62    u-248214-iot
192.168.0.63    u-10099-iot
  • Step 3, [primary node of the RabbitMQ cluster]: restart the MQ service (if the stop fails, try kill)
rabbitmqctl stop
rabbitmq-server -detached
  • Step 4, [every other node of the RabbitMQ cluster, i.e. not the primary node]: rejoin the cluster
rabbitmqctl stop_app
rabbitmqctl join_cluster --ram rabbit@u-69197-iot
rabbitmqctl start_app
  • Step 5, [any node of the RabbitMQ cluster]: confirm that the cluster status has recovered
rabbitmqctl cluster_status
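Before restarting anything, it can help to confirm that step 2 really took effect on each node. The sketch below checks that a hosts file contains every expected IP-to-name mapping; `hosts_has_entry` and `missing_hosts` are hypothetical helper names introduced here for illustration, and the script only reads the file it is given.

```shell
#!/bin/sh
# Hypothetical helpers for verifying /etc/hosts entries; read-only.

# Return 0 if hosts_file maps the given IP to the given hostname, 1 otherwise.
hosts_has_entry() {
    hosts_file=$1
    ip=$2
    name=$3
    awk -v ip="$ip" -v name="$name" '
        $1 ~ /^#/ { next }                  # skip comment lines
        $1 == ip {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break        # ignore trailing comments
                if ($i == name) found = 1
            }
        }
        END { exit (found ? 0 : 1) }
    ' "$hosts_file"
}

# Read "ip name" pairs from stdin and print each pair that is missing
# from the hosts file passed as the first argument.
missing_hosts() {
    hosts_file=$1
    while read -r ip name; do
        [ -n "$ip" ] || continue
        hosts_has_entry "$hosts_file" "$ip" "$name" || echo "$ip $name"
    done
}
```

Feeding the three "ip name" pairs from step 2 to `missing_hosts /etc/hosts` on stdin should print nothing once all entries are in place; any line it prints is a mapping that still needs to be added on that node.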
Copyright © www.sqlfans.cn 2023 All Rights Reserved. Last updated: 2024-06-03 10:40:39
