Plan for Replacing a MongoDB Replica Set's Primary Node
Current State and Requirements
- There is a 3-node MongoDB replica set whose primary (10.30.3.232, on Alibaba Cloud) frequently OOMs due to limited memory; the instance needs to be upgraded from 8C-64GB-2TB to 16C-128GB-2TB.
- Many applications list all three IPs in their Mongo connection pools, so the node should keep its IP address after the upgrade; that way no code changes are needed.
Role | ip:port | rs.conf() _id | rs.conf() priority | Notes |
---|---|---|---|---|
Secondary | 10.30.3.231:3717 | "_id" : 1 | "priority" : 10 | _id and priority taken from the current rs.conf() |
Primary | 10.30.3.232:3717 | "_id" : 2 | "priority" : 15 | This node is the primary (highest priority) |
Secondary | 10.30.3.233:3717 | "_id" : 3 | "priority" : 5 | - |
Note: for brevity, the mongo login command is abbreviated below as `mongo ip:port`.
[root@localhost ~]# mongo 10.30.3.232:3717
set3717:PRIMARY> rs.conf()
{
"members" : [
{"_id" : 1, "host" : "10.30.3.231:3717", "arbiterOnly" : false, "hidden" : false, "priority" : 10, "votes" : 1},
{"_id" : 2, "host" : "10.30.3.232:3717", "arbiterOnly" : false, "hidden" : false, "priority" : 15, "votes" : 1},
{"_id" : 3, "host" : "10.30.3.233:3717", "arbiterOnly" : false, "hidden" : false, "priority" : 5, "votes" : 1}
]
}
[root@localhost ~]# mongo 10.30.3.232:3717
set3717:PRIMARY> rs.status()
{
"members" : [
{"_id" : 1,"name" : "10.30.3.231:3717","state" : 2,"stateStr" : "SECONDARY","syncingTo" : "10.30.3.232:3717"},
{"_id" : 2,"name" : "10.30.3.232:3717","state" : 1,"stateStr" : "PRIMARY"},
{"_id" : 3,"name" : "10.30.3.233:3717","state" : 2,"stateStr" : "SECONDARY","syncingTo" : "10.30.3.232:3717"}
]
}
Implementation Steps
1. Provision a new machine with 16C-128GB-2TB; any IP will do, e.g. 10.30.3.234.
2. On the new node (10.30.3.234): unpack and configure mongo (do not start it yet). The version and config file must match the old node (10.30.3.232); note that wiredTigerCacheSizeGB should be sized against the physical memory.
[root@localhost ~]# yum install -y numactl
[root@localhost ~]# cd /opt/
[root@localhost ~]# tar xvf mongodb-linux-x86_64-4.0.25.tgz
[root@localhost ~]# mv mongodb-linux-x86_64-4.0.25 mongodb
[root@localhost ~]# cat /data/mongo_3717/mongo_3717.conf
quiet=false
timeStampFormat=iso8601-local
logappend=true
journal=true
auth=true
noprealloc=true
directoryperdb=true
fork=true
oplogSize=20480
replSet=set3717
bind_ip=0.0.0.0
port=3717
profile=1
slowms=500
cpu=true
storageEngine=wiredTiger
wiredTigerCacheSizeGB=96
wiredTigerDirectoryForIndexes=true
wiredTigerIndexPrefixCompression=true
wiredTigerJournalCompressor=zlib
wiredTigerCollectionBlockCompressor=snappy
dbpath=/data/mongo_3717/data
keyFile=/data/mongo_3717/data/keyfile
pidfilepath=/data/mongo_3717/pid/pid_3717.pid
logpath=/data/mongo_3717/log/mongo_3717.log
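As a sanity check on wiredTigerCacheSizeGB: MongoDB's default WiredTiger cache is the larger of 50% of (RAM − 1 GB) and 256 MB, and the config above deliberately goes higher (96 GB on a 128 GB machine) since the box is dedicated to mongod. A minimal sketch of the default formula, runnable in any JavaScript shell (defaultWtCacheGB is an invented helper name, not a MongoDB API):

```javascript
// Default WiredTiger cache size in GB: max(0.5 * (RAM_GB - 1), 0.25).
// defaultWtCacheGB is a hypothetical helper for illustration only.
function defaultWtCacheGB(ramGB) {
  return Math.max(0.5 * (ramGB - 1), 0.25);
}

console.log(defaultWtCacheGB(128)); // 63.5 -- default for the new 128 GB node
console.log(defaultWtCacheGB(64));  // 31.5 -- default for the old 64 GB node
```

Setting 96 GB on the new node therefore overrides the ~63.5 GB default; leave headroom for connections, indexes builds, and the OS page cache.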
3. Because the xxx database is very large, its directory (/data/mongo_3717/data/xxx) is placed on a separate disk (e.g. /dev/vdb mounted at /data2):
mkdir -p /data2/mongo_3717/data/xxx
ln -sv /data2/mongo_3717/data/xxx /data/mongo_3717/data/xxx
4. On the replica set primary (10.30.3.232), confirm the secondaries have no replication lag:
[root@localhost ~]# echo "rs.printSlaveReplicationInfo();" | mongo 10.30.3.232:3717 | grep behind
0 secs (0 hrs) behind the primary
0 secs (0 hrs) behind the primary
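To gate the cutover on lag programmatically rather than by eyeballing grep output, the same check can be done in the mongo shell (which is JavaScript) by comparing member optimes from rs.status(). A sketch, assuming a status document shaped like rs.status() output; maxLagSecs is an invented helper and the timestamps below are fabricated:

```javascript
// maxLagSecs: given an rs.status()-style document, return the worst
// secondary lag in seconds relative to the primary's optimeDate.
function maxLagSecs(status) {
  const primary = status.members.find(m => m.stateStr === "PRIMARY");
  return Math.max(0, ...status.members
    .filter(m => m.stateStr === "SECONDARY")
    .map(m => (primary.optimeDate - m.optimeDate) / 1000));
}

// Example with made-up timestamps: one secondary is 3 seconds behind.
const status = { members: [
  { stateStr: "PRIMARY",   optimeDate: new Date("2023-08-28T10:57:05Z") },
  { stateStr: "SECONDARY", optimeDate: new Date("2023-08-28T10:57:05Z") },
  { stateStr: "SECONDARY", optimeDate: new Date("2023-08-28T10:57:02Z") },
]};
console.log(maxLagSecs(status)); // 3
```

Inside the mongo shell you would pass `rs.status()` directly and loop until the result is 0 before proceeding to the priority change.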
5. On the primary (10.30.3.232), run the following to lower the primary's priority to the minimum (from 15 to 1). Note that members[] is indexed by position in the array most recently returned by rs.conf() (zero-based), not by the _id value; at this point the primary (10.30.3.232) is members[1]. Changing priorities triggers an election automatically, after which 10.30.3.231, now the highest-priority member, becomes primary.
[root@localhost ~]# mongo 10.30.3.232:3717
set3717:PRIMARY> rs.conf()
"members" : [
{"_id" : 1, "host" : "10.30.3.231:3717", "arbiterOnly" : false, "hidden" : false, "priority" : 10, "votes" : 1},
{"_id" : 2, "host" : "10.30.3.232:3717", "arbiterOnly" : false, "hidden" : false, "priority" : 15, "votes" : 1},
{"_id" : 3, "host" : "10.30.3.233:3717", "arbiterOnly" : false, "hidden" : false, "priority" : 5, "votes" : 1}
]
[root@localhost ~]# mongo 10.30.3.232:3717
set3717:PRIMARY> var cfg=rs.conf()
set3717:PRIMARY> cfg.members[1].priority=1;
set3717:PRIMARY> rs.reconfig(cfg);
set3717:SECONDARY> exit
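Because members[] is positional rather than keyed by _id, selecting the member by host is less error-prone than hard-coding an index (step 11 below hits the same pitfall in reverse). A sketch of the same change done by host; setPriorityByHost is an invented helper, and in the mongo shell cfg would come from rs.conf() and be applied with rs.reconfig(cfg):

```javascript
// setPriorityByHost: mutate an rs.conf()-style document, setting the
// priority of the member whose host matches; throw if no member matches.
function setPriorityByHost(cfg, host, priority) {
  const m = cfg.members.find(x => x.host === host);
  if (!m) throw new Error("no member with host " + host);
  m.priority = priority;
  return cfg;
}

// Example mirroring the rs.conf() above (trimmed to the relevant fields):
const cfg = { members: [
  { _id: 1, host: "10.30.3.231:3717", priority: 10 },
  { _id: 2, host: "10.30.3.232:3717", priority: 15 },
  { _id: 3, host: "10.30.3.233:3717", priority: 5 },
]};
setPriorityByHost(cfg, "10.30.3.232:3717", 1);
console.log(cfg.members[1].priority); // 1
```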
6. After the failover, on the replica set's new primary (10.30.3.231), remove the old node (10.30.3.232):
[root@localhost ~]# mongo 10.30.3.231:3717
set3717:PRIMARY> rs.remove("10.30.3.232:3717")
set3717:PRIMARY> rs.conf();
set3717:PRIMARY> exit
7. Swap the IPs: shut down both machines (Alibaba Cloud requires an instance to be stopped before its private IP can be changed). Change the old machine's private IP from 10.30.3.232 to 10.30.3.235, then change the new machine's private IP from 10.30.3.234 to 10.30.3.232, and finally power on the new machine. (Note: keep the old machine powered off for 3 days before releasing it.)
8. On the new node (10.30.3.232): once the server is up, start mongodb:
[root@localhost ~]# numactl --interleave=all /opt/mongodb/bin/mongod -f /data/mongo_3717/mongo_3717.conf
9. On the replica set primary (10.30.3.231), add the new node (10.30.3.232) with the lowest priority. To keep the _id consistent with the old node, the example specifies _id: 2. The new node's data is initial-synced from scratch (no backup-and-restore is needed on the new node).
[root@localhost ~]# mongo 10.30.3.231:3717
set3717:PRIMARY> rs.add({ _id:2, host:"10.30.3.232:3717", priority:1, votes:1});
set3717:PRIMARY> rs.conf();
set3717:PRIMARY> exit
10. Watch the lag and wait for the initial sync to finish.
For reference: on 2023-08-28 a new node (gigabit internal network) initial-synced 828 GB in 05:30:00 (10:57-16:27); on 2025-01-01, 1.6 TB took 08:24:00 (08:38-17:02).
[root@localhost ~]# echo "rs.printSlaveReplicationInfo();" | mongo 10.30.3.231:3717 | grep behind
0 secs (0 hrs) behind the primary
0 secs (0 hrs) behind the primary
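The timings above work out to roughly 150 GB/h (2023) and 190 GB/h (2025), which is useful for estimating the maintenance window of a future sync. A throwaway calculation (gbPerHour is an invented helper; 1.6 TB is treated as 1600 GB):

```javascript
// gbPerHour: rough initial-sync throughput from size and an "HH:MM:SS" duration.
function gbPerHour(gb, hhmmss) {
  const [h, m, s] = hhmmss.split(":").map(Number);
  return gb / (h + m / 60 + s / 3600);
}

console.log(gbPerHour(828, "05:30:00").toFixed(1));  // "150.5"
console.log(gbPerHour(1600, "08:24:00").toFixed(1)); // "190.5"
```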
11. Once the new node has caught up, raise its priority to the highest (from 1 to 15). Again, members[] is indexed by position in the array most recently returned by rs.conf() (zero-based), not by _id; at this point the new node (10.30.3.232) is members[2]. The reconfig triggers an election, and the new node (10.30.3.232), now the highest-priority member, becomes primary.
[root@localhost ~]# mongo 10.30.3.231:3717
set3717:PRIMARY> rs.conf()
"members" : [
{"_id" : 1, "host" : "10.30.3.231:3717", "arbiterOnly" : false, "hidden" : false, "priority" : 10, "votes" : 1},
{"_id" : 3, "host" : "10.30.3.233:3717", "arbiterOnly" : false, "hidden" : false, "priority" : 5, "votes" : 1},
{"_id" : 2, "host" : "10.30.3.232:3717", "arbiterOnly" : false, "hidden" : false, "priority" : 1, "votes" : 1}
]
[root@localhost ~]# mongo 10.30.3.231:3717
set3717:PRIMARY> var cfg=rs.conf()
set3717:PRIMARY> cfg.members[2].priority=15;
set3717:PRIMARY> rs.reconfig(cfg);
set3717:SECONDARY> exit
12. Confirm the current primary node:
[root@localhost ~]# echo "rs.isMaster();" | mongo 10.30.3.232:3717 | grep primary
"primary" : "10.30.3.232:3717",