Setting up Redis Sentinel

[TOC]

Install a single Redis node

  • [All nodes] Quick install of Redis 7.0.11; in this example the data directory is /data/redis_${port} and the port is 7100
cd /opt/
wget -c http://iso.sqlfans.cn/redis/redis-7.0.11.tar.gz
wget -c http://iso.sqlfans.cn/redis/install_redis_7011.sh
sh install_redis_7011.sh /data 7100
  • Verify reads, writes, and the version
echo "set dba kevin" | /usr/local/bin/redis-cli -a RbY9k2_NBf1QWy8I -c -p 7100 2>/dev/null
echo "get dba" | /usr/local/bin/redis-cli -a RbY9k2_NBf1QWy8I -c -p 7100 2>/dev/null
echo "save" | /usr/local/bin/redis-cli -a RbY9k2_NBf1QWy8I -c -p 7100 2>/dev/null
echo "info server" | /usr/local/bin/redis-cli -a RbY9k2_NBf1QWy8I -c -p 7100 2>/dev/null | grep redis_version
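
The grep above is enough for a visual check. When the version has to be captured in a script, note that INFO output normally uses CRLF line endings, so the value carries a trailing carriage return. A minimal sketch of a helper that extracts one field cleanly; the name redis_info_field is hypothetical and not part of Redis or the install script:

```shell
# redis_info_field is a hypothetical helper name (not part of Redis or the install script).
# It reads INFO output on stdin and prints the value of the field named in $1.
redis_info_field() {
  # Strip carriage returns, then print everything after "<key>:" on the first match.
  tr -d '\r' | awk -F: -v key="$1" '$1 == key { print substr($0, length(key) + 2); exit }'
}

# Usage (assumes the password is exported as REDISCLI_AUTH, which redis-cli reads):
#   export REDISCLI_AUTH='<password>'
#   /usr/local/bin/redis-cli -p 7100 info server | redis_info_field redis_version
```

Exporting REDISCLI_AUTH instead of passing -a also keeps the password out of the process list and avoids the warning that the 2>/dev/null redirects hide.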

Set up Redis Sentinel

  • Host plan
IP address    Role       Redis port  Sentinel port  OS              Data directory
10.30.3.231   master 1   7100        7200           CentOS 7.9 x64  /data/redis_7100
10.30.3.232   replica 2  7100        7200           CentOS 7.9 x64  /data/redis_7100
10.30.3.233   replica 3  7100        7200           CentOS 7.9 x64  /data/redis_7100

1. Install a single Redis node

  • All nodes: for the single-node installation, see the section above
  • All nodes: for the Redis configuration, see cat /data/redis_7100/redis_7100.conf

2. Set up Redis master-replica replication

  • 2.1. Master node: add the masterauth password (keep it identical to requirepass); the master does not need replicaof
cat /data/redis_7100/redis_7100.conf | grep "^masterauth" || echo "masterauth RbY9k2_NBf1QWy8I" >> /data/redis_7100/redis_7100.conf
  • 2.2. Replica nodes: add masterauth and replicaof, pointing at the master's IP (10.30.3.231 in this example) and port (7100 in this example)
cat /data/redis_7100/redis_7100.conf | grep "^masterauth" || echo "masterauth RbY9k2_NBf1QWy8I" >> /data/redis_7100/redis_7100.conf
cat /data/redis_7100/redis_7100.conf | grep "^replicaof" || echo "replicaof 10.30.3.231 7100" >> /data/redis_7100/redis_7100.conf
  • 2.3. All nodes: restart the Redis service and check the replication status
# Do not disable the CONFIG command, otherwise Sentinel cannot fail over
sed -i "s/^rename-command CONFIG/#rename-command CONFIG/" /data/redis_7100/redis_7100.conf

/usr/local/bin/redis-cli -a RbY9k2_NBf1QWy8I -h 127.0.0.1 -p 7100 shutdown
sudo -u redis /usr/local/bin/redis-server /data/redis_7100/redis_7100.conf
echo "info replication" | /usr/local/bin/redis-cli -a RbY9k2_NBf1QWy8I -c -p 7100 2>/dev/null | egrep "(role|slave0|slave1)"
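
The cat | grep || echo pattern in steps 2.1 and 2.2 appends a directive only when it is missing. A minimal sketch of the same idea as a reusable function; ensure_conf_line is a hypothetical name, and the check assumes the directive is written at the start of a line followed by whitespace:

```shell
# ensure_conf_line is a hypothetical helper: append "key value" to a config file
# only when no uncommented line already starts with that key.
# Args: $1 = config file, $2 = directive name, $3 = directive value.
ensure_conf_line() {
  file="$1"; key="$2"; value="$3"
  if grep -q "^${key}[[:space:]]" "$file"; then
    return 0                      # directive already present, leave it untouched
  fi
  printf '%s %s\n' "$key" "$value" >> "$file"
}

# Usage on a replica, mirroring step 2.2:
#   ensure_conf_line /data/redis_7100/redis_7100.conf masterauth '<password>'
#   ensure_conf_line /data/redis_7100/redis_7100.conf replicaof '10.30.3.231 7100'
```

Running it twice leaves a single line in the file, so the step can be repeated safely. Like the original commands, it does not update an existing directive that has a different value.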

3. Set up Redis Sentinel

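  • 3.1. All nodes: configure Sentinel. The sentinel monitor line declares the master to watch: mymaster is a user-chosen name for the master, 10.30.3.231 7100 is the master's IP and port, and 2 is the quorum, i.e. the number of Sentinels that must agree the master is unreachable before it is marked objectively down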
cat > /data/redis_7100/sentinel_7200.conf <<EOF
bind 0.0.0.0
port 7200
daemonize yes
supervised systemd
logfile /data/redis_7100/log/sentinel.log
pidfile /data/redis_7100/pid/sentinel.pid
dir /data/redis_7100/dump
sentinel monitor mymaster 10.30.3.231 7100 2
sentinel auth-pass mymaster RbY9k2_NBf1QWy8I
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 10000
EOF
  • 3.2. All nodes: start the Sentinel process
chown -R redis:redis /data/redis_7100
chmod 600 /data/redis_7100/*.conf
sudo -u redis /usr/local/bin/redis-sentinel /data/redis_7100/sentinel_7200.conf
netstat -lntp | egrep "(7100|7200)"
  • 3.3. All nodes: add Sentinel to boot startup
cat /etc/rc.local | grep sentinel_7200 || echo "sudo -u redis /usr/local/bin/redis-sentinel /data/redis_7100/sentinel_7200.conf" >> /etc/rc.local
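
The quorum of 2 in step 3.1 only decides when the master is flagged as objectively down. Per the Sentinel documentation, the failover itself also has to be authorized by a majority of all known Sentinels (or by quorum-many, if the quorum is larger than the majority). A minimal sketch of that arithmetic; sentinel_can_failover is a hypothetical helper, not a Redis command:

```shell
# sentinel_can_failover is a hypothetical helper, not a Redis command.
# Args: $1 = total Sentinels configured, $2 = Sentinels still reachable, $3 = quorum.
# Succeeds (exit 0) when enough Sentinels remain to both flag the master as
# objectively down and authorize a failover.
sentinel_can_failover() {
  total="$1"; alive="$2"; quorum="$3"
  majority=$(( total / 2 + 1 ))   # strict majority of all configured Sentinels
  needed=$majority
  if [ "$quorum" -gt "$needed" ]; then
    needed=$quorum                # a quorum above the majority raises the bar
  fi
  [ "$alive" -ge "$needed" ]
}

# Usage for this layout: the master host dies and takes its Sentinel with it,
# leaving 2 of 3 Sentinels.
#   sentinel_can_failover 3 2 2 && echo "failover possible"
```

With three Sentinels and a quorum of 2, the loss of one host is tolerated; the loss of two is not.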

4. Verify the cluster

  • 4.1. Master node: write data
echo "info replication" | /usr/local/bin/redis-cli -a RbY9k2_NBf1QWy8I -c -p 7100 2>/dev/null | egrep "(role|slave0|slave1)"
echo "set dev sam" | /usr/local/bin/redis-cli -a RbY9k2_NBf1QWy8I -c -p 7100 2>/dev/null
echo "get dev" | /usr/local/bin/redis-cli -a RbY9k2_NBf1QWy8I -c -p 7100 2>/dev/null
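  • 4.2. Replica nodes: check the replication status, data sync, and read-only state (the final set is expected to fail, because replicas are read-only by default)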
echo "info replication" | /usr/local/bin/redis-cli -a RbY9k2_NBf1QWy8I -c -p 7100 2>/dev/null | egrep "(role|slave0|slave1)"
echo "get dev" | /usr/local/bin/redis-cli -a RbY9k2_NBf1QWy8I -c -p 7100 2>/dev/null
echo "set dev sam" | /usr/local/bin/redis-cli -a RbY9k2_NBf1QWy8I -c -p 7100 2>/dev/null
  • 4.3. Any node: check the Sentinel status
/usr/local/bin/redis-cli -c -h 10.30.3.231 -p 7200 info sentinel | grep master
/usr/local/bin/redis-cli -c -h 10.30.3.232 -p 7200 info sentinel | grep master
/usr/local/bin/redis-cli -c -h 10.30.3.233 -p 7200 info sentinel | grep master
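
Each command in step 4.3 prints a master0 line of comma-separated key=value pairs. To compare the three Sentinels in a script rather than by eye, a minimal sketch of a parser follows; sentinel_master_field is a hypothetical helper, and the line layout shown in the comment is assumed from typical info sentinel output:

```shell
# sentinel_master_field is a hypothetical helper: read "info sentinel" output on
# stdin and print one key (e.g. address, status, slaves, sentinels) from the
# master0 line. Assumed line layout:
#   master0:name=mymaster,status=ok,address=10.30.3.231:7100,slaves=2,sentinels=3
sentinel_master_field() {
  tr -d '\r' | awk -v key="$1" '
    /^master0:/ {
      sub(/^master0:/, "")
      n = split($0, pairs, ",")
      for (i = 1; i <= n; i++) {
        eq = index(pairs[i], "=")
        if (substr(pairs[i], 1, eq - 1) == key) { print substr(pairs[i], eq + 1); exit }
      }
    }'
}

# Usage:
#   /usr/local/bin/redis-cli -h 10.30.3.231 -p 7200 info sentinel | sentinel_master_field address
```

In a healthy setup all three Sentinels should report the same address, with slaves=2 and sentinels=3.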

5. Disaster drill - simulated master failure

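  • 5.1. Master node: kill the Redis processes to simulate a master failure (the command below matches both redis-server and redis-sentinel on that host)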
ps -ef | grep redis | grep -v grep | awk '{print $2}' | xargs kill -9 2> /dev/null
  • 5.2. Any node: check which node is now the master
echo "info replication" | /usr/local/bin/redis-cli -a RbY9k2_NBf1QWy8I -c -h 10.30.3.231 -p 7100 2>/dev/null | egrep "(role|slave0|slave1)"
echo "info replication" | /usr/local/bin/redis-cli -a RbY9k2_NBf1QWy8I -c -h 10.30.3.232 -p 7100 2>/dev/null | egrep "(role|slave0|slave1)"
echo "info replication" | /usr/local/bin/redis-cli -a RbY9k2_NBf1QWy8I -c -h 10.30.3.233 -p 7100 2>/dev/null | egrep "(role|slave0|slave1)"
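  • 5.3. Any replica node: review the failover in the Sentinel log. Takeaway: a master failure triggers an election and an automatic failover
cat /data/redis_7100/log/sentinel.log | egrep "(\+monitor|\+sentinel|\+sdown|\+vote-for-leader|\+odown|\+config-update-from|\+switch-master|\+slave)" | tail -n20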

Note: a complete sample failover log is available: curl http://iso.sqlfans.cn/redis/switch_master_log.txt

  • 5.4. Failed node: once the test is done, start Redis and Sentinel again
sudo -u redis /usr/local/bin/redis-server /data/redis_7100/redis_7100.conf
sudo -u redis /usr/local/bin/redis-sentinel /data/redis_7100/sentinel_7200.conf
netstat -lntp | egrep "(7100|7200)"
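
In the log reviewed in step 5.3, the +switch-master event carries the master name followed by the old and new address. A minimal sketch that pulls the most recent switch out of the log; last_switch_master is a hypothetical helper, and it assumes the event fields appear in the order name, old IP, old port, new IP, new port:

```shell
# last_switch_master is a hypothetical helper: read a Sentinel log on stdin and
# print "old_ip:old_port -> new_ip:new_port" for the most recent +switch-master event.
last_switch_master() {
  awk '
    {
      for (i = 1; i <= NF; i++) {
        # Fields after the event name: master name, old ip, old port, new ip, new port
        if ($i == "+switch-master" && NF >= i + 5) {
          last = $(i + 2) ":" $(i + 3) " -> " $(i + 4) ":" $(i + 5)
        }
      }
    }
    END { if (last != "") print last }'
}

# Usage:
#   last_switch_master < /data/redis_7100/log/sentinel.log
# The current master can also be asked from any Sentinel directly:
#   /usr/local/bin/redis-cli -p 7200 sentinel get-master-addr-by-name mymaster
```

The function prints nothing when the log contains no failover.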

6. Disaster drill - simulated replica failure

  • 6.1. Any replica node (say redis03): kill the Redis processes to simulate a replica failure
ps -ef | grep redis | grep -v grep | awk '{print $2}' | xargs kill -9 2> /dev/null
  • 6.2. Any node: check which node is now the master
echo "info replication" | /usr/local/bin/redis-cli -a RbY9k2_NBf1QWy8I -c -h 10.30.3.231 -p 7100 2>/dev/null | egrep "(role|slave0|slave1)"
echo "info replication" | /usr/local/bin/redis-cli -a RbY9k2_NBf1QWy8I -c -h 10.30.3.232 -p 7100 2>/dev/null | egrep "(role|slave0|slave1)"
echo "info replication" | /usr/local/bin/redis-cli -a RbY9k2_NBf1QWy8I -c -h 10.30.3.233 -p 7100 2>/dev/null | egrep "(role|slave0|slave1)"
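  • 6.3. Master node (say redis01): write data on the master and confirm the other replica (say redis02) still syncs. Takeaway: losing a single replica does not affect replication to the remaining nodes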
echo "set dev2 sam2" | /usr/local/bin/redis-cli -a RbY9k2_NBf1QWy8I -c -h 10.30.3.231 -p 7100 2>/dev/null
echo "get dev2" | /usr/local/bin/redis-cli -a RbY9k2_NBf1QWy8I -c -h 10.30.3.232 -p 7100 2>/dev/null
  • 6.4. Failed node: once the test is done, start Redis and Sentinel again
sudo -u redis /usr/local/bin/redis-server /data/redis_7100/redis_7100.conf
sudo -u redis /usr/local/bin/redis-sentinel /data/redis_7100/sentinel_7200.conf
netstat -lntp | egrep "(7100|7200)"
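
To confirm that the restarted replica has rejoined, the slaveN lines of info replication on the master can be filtered by state. A minimal sketch; online_replicas is a hypothetical helper, and it assumes IPv4 addresses and the usual slaveN:ip=...,port=...,state=... layout:

```shell
# online_replicas is a hypothetical helper: read "info replication" output from
# the master on stdin and print ip:port for every replica whose state is online.
# Assumes IPv4 addresses (an IPv6 address would break the ":" field split).
online_replicas() {
  tr -d '\r' | awk -F'[:,=]' '/^slave[0-9]+:/ {
    ip = ""; port = ""; state = ""
    # After the slaveN prefix, fields alternate key, value, key, value, ...
    for (i = 2; i < NF; i += 2) {
      if ($i == "ip") ip = $(i + 1)
      else if ($i == "port") port = $(i + 1)
      else if ($i == "state") state = $(i + 1)
    }
    if (state == "online") print ip ":" port
  }'
}

# Usage:
#   /usr/local/bin/redis-cli -h 10.30.3.231 -p 7100 info replication | online_replicas
```

After step 6.4 both replicas should be listed again once their resync has finished.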

Problems encountered

Scenario 1: how to completely uninstall Redis

  • Based on the installation procedure, the steps below remove Redis completely (this also deletes all data under /data/redis*)
cd /opt/
ps -ef | grep redis | grep -v grep | awk '{print $2}' | xargs kill -9 2> /dev/null
userdel -r redis 2> /dev/null
rm -f /opt/install_redis_*.sh
rm -rf /data/redis*
rm -rf /usr/local/redis*
rm -f /usr/local/bin/redis*
sed -i '/redis/d' /etc/rc.local
netstat -lnpt | grep redis

Scenario 2: how to configure the Redis connection string in nacos

  • If Redis is a standalone, master-standby, read/write-splitting, or Proxy cluster instance, configure nacos with the IP and port of the Redis service
spring.redis.host=10.30.3.231
spring.redis.port=7100
spring.redis.password=<password>
  • If Redis runs in Sentinel mode, configure nacos with the IPs and ports of the Sentinel services
spring.redis.sentinel.master=mymaster
spring.redis.sentinel.nodes=10.30.3.231:7200,10.30.3.232:7200,10.30.3.233:7200
spring.redis.password=<password>
  • If Redis is a Cluster instance, for example 3 masters and 3 replicas, list the IP and port of every node; for more, see "Connecting to Redis with the Jedis client"
spring.redis.cluster.nodes=<ip1:port1>,<ip2:port2>,<ip3:port3>,<ip4:port4>,<ip5:port5>,<ip6:port6>
spring.redis.password=<password>

Scenario 3: no failover when the master goes down

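  • Symptom: on 2023-07-21, the redis process on the master was killed to simulate a master failure, but Sentinel did not fail over; info replication showed both replicas still in the slave role
  • Cause: the optimized installation set rename-command CONFIG "" to forbid configuration changes, which prevented failover when the master went down
  • Fix: comment out the line that disables CONFIG and restart the Redis service, i.e. #rename-command CONFIG ""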
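
Before a failover drill, each node's configuration can be checked for this condition. A minimal sketch; config_command_renamed is a hypothetical helper that only looks for an uncommented rename-command line targeting CONFIG:

```shell
# config_command_renamed is a hypothetical helper: succeed (exit 0) when the given
# redis.conf contains an uncommented rename-command line for CONFIG.
config_command_renamed() {
  grep -Eiq '^[[:space:]]*rename-command[[:space:]]+CONFIG([[:space:]]|$)' "$1"
}

# Usage:
#   if config_command_renamed /data/redis_7100/redis_7100.conf; then
#     echo "CONFIG is renamed or disabled: Sentinel failover may not work"
#   fi
```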
Copyright © www.sqlfans.cn 2024 All Rights Reserved. Last updated: 2024-06-17 11:14:59
