MongoDB Replica Set Primary Node Replacement Plan

Current State and Requirements

  • We have a 3-node MongoDB replica set whose primary (10.30.3.232, on Alibaba Cloud) has too little memory and frequently hits OOM; the instance needs to be upgraded from 8C-64GB-2TB to 16C-128GB-2TB.
  • Many applications have all 3 IPs configured in their mongo connection pools, so the node's IP address must stay the same after the upgrade; that way no code changes are needed.
| Role | IP:port | rs.conf() _id | rs.conf() priority | Notes |
| --- | --- | --- | --- | --- |
| Secondary | 10.30.3.231:3717 | "_id" : 1 | "priority" : 10 | _id and priority taken from the current rs.conf() |
| Primary | 10.30.3.232:3717 | "_id" : 2 | "priority" : 15 | This node is the primary (highest priority) |
| Secondary | 10.30.3.233:3717 | "_id" : 3 | "priority" : 5 | - |

Note: for brevity, the login command is abbreviated below as mongo ip:port

[root@localhost ~]# mongo 10.30.3.232:3717
set3717:PRIMARY> rs.conf()
{
    "members" : [
        {"_id" : 1, "host" : "10.30.3.231:3717", "arbiterOnly" : false, "hidden" : false, "priority" : 10, "votes" : 1},
        {"_id" : 2, "host" : "10.30.3.232:3717", "arbiterOnly" : false, "hidden" : false, "priority" : 15, "votes" : 1},
        {"_id" : 3, "host" : "10.30.3.233:3717", "arbiterOnly" : false, "hidden" : false, "priority" :  5, "votes" : 1}
    ]
}

[root@localhost ~]# mongo 10.30.3.232:3717
set3717:PRIMARY> rs.status()
{
    "members" : [
        {"_id" : 1,"name" : "10.30.3.231:3717","state" : 2,"stateStr" : "SECONDARY","syncingTo" : "10.30.3.232:3717"},
        {"_id" : 2,"name" : "10.30.3.232:3717","state" : 1,"stateStr" : "PRIMARY"},
        {"_id" : 3,"name" : "10.30.3.233:3717","state" : 2,"stateStr" : "SECONDARY","syncingTo" : "10.30.3.232:3717"}
    ]
}

Implementation Steps

  • 1. Provision a new machine with 16C-128GB-2TB; the IP can be arbitrary, e.g. 10.30.3.234

  • 2. On the new node (10.30.3.234): unpack and configure mongo (do not start it yet). The version and config file should match the old node (10.30.3.232); note that wiredTigerCacheSizeGB should be sized according to the machine's physical memory

[root@localhost ~]# yum install -y numactl
[root@localhost ~]# cd /opt/
[root@localhost ~]# tar xvf mongodb-linux-x86_64-4.0.25.tgz 
[root@localhost ~]# mv mongodb-linux-x86_64-4.0.25 mongodb

[root@localhost ~]# cat /data/mongo_3717/mongo_3717.conf
quiet=false
timeStampFormat=iso8601-local
logappend=true
journal=true
auth=true
noprealloc=true
directoryperdb=true
fork=true
oplogSize=20480
replSet=set3717
bind_ip=0.0.0.0
port=3717
profile=1
slowms=500
cpu=true
storageEngine=wiredTiger
wiredTigerCacheSizeGB=96
wiredTigerDirectoryForIndexes=true
wiredTigerIndexPrefixCompression=true
wiredTigerJournalCompressor=zlib
wiredTigerCollectionBlockCompressor=snappy
dbpath=/data/mongo_3717/data
keyFile=/data/mongo_3717/data/keyfile
pidfilepath=/data/mongo_3717/pid/pid_3717.pid
logpath=/data/mongo_3717/log/mongo_3717.log
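For reference, WiredTiger's default internal cache is the larger of 50% of (RAM - 1 GB) and 256 MB; on a 128 GB machine that default would be 63.5 GB, so the 96 GB above is an explicit override (roughly 75% of RAM, leaving headroom for connections and the filesystem cache). A minimal sketch of the default calculation (the function name is mine, not a MongoDB API):

```javascript
// Default WiredTiger internal cache size per the MongoDB docs:
// max(50% of (RAM - 1 GB), 256 MB). Input and output in GB.
function defaultWiredTigerCacheGB(ramGB) {
  const half = 0.5 * (ramGB - 1);
  return Math.max(half, 0.25); // 256 MB floor, expressed as 0.25 GB
}

console.log(defaultWiredTigerCacheGB(128)); // 63.5 on the new 128 GB node
console.log(defaultWiredTigerCacheGB(64));  // 31.5 on the old 64 GB node
```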
  • 3. Because the xxx database is very large, its directory (/data/mongo_3717/data/xxx) is deliberately placed on a separate disk (e.g. /dev/vdb mounted as /data2):
mkdir -p /data2/mongo_3717/data/xxx
ln -sv /data2/mongo_3717/data/xxx /data/mongo_3717/data/xxx
  • 4. On the replica set primary (10.30.3.232), confirm there is no replication lag:
[root@localhost ~]# echo "rs.printSlaveReplicationInfo();" | mongo 10.30.3.232:3717 | grep behind
        0 secs (0 hrs) behind the primary 
        0 secs (0 hrs) behind the primary
  • 5. On the replica set primary (10.30.3.232), run the following to lower the primary's priority to the minimum (from 15 to 1). Note that the members[] index follows the order of the latest rs.conf() output (0-based) and is not the _id value; here the primary (10.30.3.232) is members[1]. Changing priorities automatically triggers an election, after which 10.30.3.231, now holding the highest priority, becomes primary.
[root@localhost ~]# mongo 10.30.3.232:3717
set3717:PRIMARY> rs.conf()
    "members" : [
        {"_id" : 1, "host" : "10.30.3.231:3717", "arbiterOnly" : false, "hidden" : false, "priority" : 10, "votes" : 1},
        {"_id" : 2, "host" : "10.30.3.232:3717", "arbiterOnly" : false, "hidden" : false, "priority" : 15, "votes" : 1},
        {"_id" : 3, "host" : "10.30.3.233:3717", "arbiterOnly" : false, "hidden" : false, "priority" :  5, "votes" : 1}
    ]

[root@localhost ~]# mongo 10.30.3.232:3717
set3717:PRIMARY> var cfg=rs.conf()
set3717:PRIMARY> cfg.members[1].priority=1;
set3717:PRIMARY> rs.reconfig(cfg);
set3717:SECONDARY> exit
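Because the members[] index is positional and can drift from the _id values (as step 11 below shows), hardcoding members[1] is fragile. A safer pattern is to look the member up by host before reconfiguring; this is plain JavaScript that also runs in the mongo shell, and setPriorityByHost is my helper name, not a built-in:

```javascript
// Set a replica set member's priority by host string rather than by
// array position. cfg is the object returned by rs.conf(); the member
// list is searched for an exact host match.
function setPriorityByHost(cfg, host, priority) {
  const i = cfg.members.findIndex(function (m) { return m.host === host; });
  if (i === -1) throw new Error("host not found in rs.conf(): " + host);
  cfg.members[i].priority = priority;
  return cfg;
}

// In the mongo shell this would be followed by:
//   var cfg = rs.conf();
//   rs.reconfig(setPriorityByHost(cfg, "10.30.3.232:3717", 1));
```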
  • 6. After the failover, on the new replica set primary (10.30.3.231), run the following to remove the old node (10.30.3.232):
[root@localhost ~]# mongo 10.30.3.231:3717
set3717:PRIMARY> rs.remove("10.30.3.232:3717")
set3717:PRIMARY> rs.conf();
set3717:PRIMARY> exit
  • 7. Swap the IPs: shut down both machines (Alibaba Cloud requires a shutdown to change the private IP), change the old machine's private IP from 10.30.3.232 to 10.30.3.235, then change the new machine's private IP from 10.30.3.234 to 10.30.3.232, and finally power on the new machine. (Note: keep the old machine powered off for 3 days before deprovisioning it.)

  • 8. On the new node (10.30.3.232): once the server is up, start mongodb

[root@localhost ~]# numactl --interleave=all /opt/mongodb/bin/mongod -f /data/mongo_3717/mongo_3717.conf
  • 9. On the replica set primary (10.30.3.231), run the following to add the new node (10.30.3.232). To keep the _id consistent with the old node, the example specifies _id 2; priority is set to the minimum. The new node's data will be initialized from scratch. (Note: no backup + restore is needed on the new node.)
[root@localhost ~]# mongo 10.30.3.231:3717
set3717:PRIMARY> rs.add({ _id:2, host:"10.30.3.232:3717", priority:1, votes:1});
set3717:PRIMARY> rs.conf();
set3717:PRIMARY> exit
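Reusing _id 2 only works because the old member was removed in step 6; if that _id were still present in the config, rs.add() would fail. A small pre-flight guard (idIsFree is my helper, not a mongo built-in):

```javascript
// Check that a candidate _id is not already used by any member in the
// replica set config returned by rs.conf().
function idIsFree(cfg, candidateId) {
  return cfg.members.every(function (m) { return m._id !== candidateId; });
}

// In the mongo shell:
//   if (idIsFree(rs.conf(), 2)) {
//     rs.add({ _id: 2, host: "10.30.3.232:3717", priority: 1, votes: 1 });
//   }
```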
  • 10. Watch the replication lag and wait for the initial sync to finish

2023-08-28: initial sync of 828 GB on the new node (gigabit internal network) took 05:30:00 (10:57-16:27); 2025-01-01: initial sync of 1.6 TB took 08:24:00 (08:38-17:02)

[root@localhost ~]# echo "rs.printSlaveReplicationInfo();" | mongo 10.30.3.231:3717 | grep behind
        0 secs (0 hrs) behind the primary 
        0 secs (0 hrs) behind the primary
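The two timings above imply an effective initial-sync throughput of roughly 43 MB/s and 55 MB/s, comfortably below the ~117 MB/s ceiling of a gigabit link. A rough estimator for planning future syncs (my helper, purely illustrative; it assumes 1 GB = 1024 MB):

```javascript
// Effective initial-sync throughput in MB/s, given the data size in GB
// and the observed wall-clock duration in hours.
function syncThroughputMBps(sizeGB, hours) {
  return (sizeGB * 1024) / (hours * 3600);
}

console.log(syncThroughputMBps(828, 5.5));    // ~42.8 for the 2023 sync
console.log(syncThroughputMBps(1638.4, 8.4)); // ~55.5 for the 2025 sync (1.6 TB)
```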
  • 11. Once the new node has caught up, raise its priority to the maximum (from 1 to 15). Again, the members[] index follows the order of the latest rs.conf() output (0-based) and is not the _id value; here the new node (10.30.3.232) is members[2]. Changing priorities automatically triggers an election, after which the new node (10.30.3.232), now holding the highest priority, becomes primary.
[root@localhost ~]# mongo 10.30.3.231:3717
set3717:PRIMARY> rs.conf()
    "members" : [
        {"_id" : 1, "host" : "10.30.3.231:3717", "arbiterOnly" : false, "hidden" : false, "priority" : 10, "votes" : 1},
        {"_id" : 3, "host" : "10.30.3.233:3717", "arbiterOnly" : false, "hidden" : false, "priority" :  5, "votes" : 1}
        {"_id" : 2, "host" : "10.30.3.232:3717", "arbiterOnly" : false, "hidden" : false, "priority" :  1, "votes" : 1},
    ]

[root@localhost ~]# mongo 10.30.3.231:3717
set3717:PRIMARY> var cfg=rs.conf()
set3717:PRIMARY> cfg.members[2].priority=15;
set3717:PRIMARY> rs.reconfig(cfg);
set3717:SECONDARY> exit
  • 12. Confirm the current primary node:
[root@localhost ~]# echo "rs.isMaster();" | mongo 10.30.3.232:3717 | grep primary
        "primary" : "10.30.3.232:3717",
Copyright © www.sqlfans.cn 2024 All Rights Reserved. Last updated: 2025-01-01 18:03:11
