The architecture diagram is as follows:
1. Prepare the machines as shown in the architecture diagram, and set up time synchronization and hostname resolution (a minimal setup sketch follows the host list below)
192.168.42.150 node1 [proxySQL keepalived]
192.168.42.151 node2 [proxySQL keepalived]
192.168.42.152 node3 [mysql-master mha]
192.168.42.153 node4 [mysql-slave1 mha]
192.168.42.154 node5 [mysql-slave2 mha]
192.168.42.155 node6 [mysql-slave3 mha]
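A minimal sketch of this preparation, run on every node (using chrony for time sync is an assumption here; any NTP client will do):

cat >> /etc/hosts <<'EOF'
192.168.42.150 node1
192.168.42.151 node2
192.168.42.152 node3
192.168.42.153 node4
192.168.42.154 node5
192.168.42.155 node6
EOF
yum install chrony -y
systemctl start chronyd      # time synchronization
systemctl enable chronyd
chronyc sources              # verify that a time source is reachable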
2. First, set up master-slave replication
(1). Install MariaDB on node3, node4, node5 and node6
yum install mariadb mariadb-server -y
(2). Configure /etc/my.cnf.d/server.cnf on node3
node3 [master]:
vim /etc/my.cnf.d/server.cnf
[mysqld]
innodb_file_per_table=ON
skip_name_resolve=ON
server_id = 1
log_bin = log-bin
(3). Start MariaDB on node3
systemctl start mariadb.service
(4). Log in to MySQL and create the replication account
grant all privileges on *.* to 'mageedu'@'192.168.42.%' identified by '123456';
grant replication slave,replication client on *.* to 'repuser'@'192.168.42.%' identified by 'repass';
flush privileges;
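Before configuring the slaves it is worth confirming that the replication account can actually log in over the network; a quick check from any slave host (this verification step is an addition, not part of the original walkthrough):

mysql -urepuser -prepass -h192.168.42.152 -e 'SHOW GRANTS;'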
(5). Configure the remaining slave nodes
node4:
[mysqld]
innodb_file_per_table=ON
skip_name_resolve=ON
server_id = 11
relay_log=relay-log
read_only=ON
node5:
[mysqld]
innodb_file_per_table=ON
skip_name_resolve=ON
server_id = 12
relay_log=relay-log
read_only=ON
node6:
[mysqld]
innodb_file_per_table=ON
skip_name_resolve=ON
server_id = 13
relay_log=relay-log
read_only=ON
After configuring, start MariaDB on each slave.
(6). Start replication on each slave node
Before doing this, check the binary log position on the master:
MariaDB [(none)]> show master logs;
+----------------+-----------+
| Log_name       | File_size |
+----------------+-----------+
| log-bin.000001 |     30331 |
| log-bin.000002 |   1038814 |
| log-bin.000003 |       514 |
| log-bin.000004 |       245 |
+----------------+-----------+
4 rows in set (0.00 sec)
Then configure and start the slave threads on each slave:
CHANGE MASTER TO MASTER_HOST='192.168.42.152',MASTER_USER='repuser',MASTER_PASSWORD='repass',MASTER_LOG_FILE='log-bin.000004',MASTER_LOG_POS=245;
START SLAVE;
SHOW SLAVE STATUS\G
MariaDB [(none)]> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.42.152
                  Master_User: repuser
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: log-bin.000004
          Read_Master_Log_Pos: 245
               Relay_Log_File: relay-log.000002
                Relay_Log_Pos: 527
        Relay_Master_Log_File: log-bin.000004
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 245
              Relay_Log_Space: 815
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
1 row in set (0.00 sec)
Master-slave replication is now complete.
(7). Test the replication
Create the helloword database on the master:
MariaDB [(none)]> create database helloword;
Query OK, 1 row affected (0.00 sec)
Take a look:
MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| helloword          |
| mysql              |
| performance_schema |
| test               |
+--------------------+
5 rows in set (0.00 sec)
Then check on each slave node:
MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| helloword          |
| mysql              |
| performance_schema |
| test               |
+--------------------+
5 rows in set (0.01 sec)
The test succeeded.
3. Now set up read/write splitting on top of the replication
(1). Install ProxySQL on node1 and node2.
Download ProxySQL:
wget -c ftp://172.16.0.1/pub/Sources/7.x86_64/proxysql/proxysql-1.3.6-1-centos7.x86_64.rpm
yum install proxysql-1.3.6-1-centos7.x86_64.rpm -y
Install the MariaDB client
yum install mariadb -y
(2). Configure ProxySQL:
vim /etc/proxysql.cnf
Example configuration:
datadir="/var/lib/proxysql"
admin_variables=
{
    admin_credentials="admin:admin"
    mysql_ifaces="127.0.0.1:6032;/tmp/proxysql_admin.sock"
}
mysql_variables=
{
    threads=4
    max_connections=2048
    default_query_delay=0
    default_query_timeout=36000000
    have_compress=true
    poll_timeout=2000
    interfaces="0.0.0.0:3306;/tmp/mysql.sock"
    default_schema="information_schema"
    stacksize=1048576
    server_version="5.5.30"
    connect_timeout_server=3000
    monitor_history=600000
    monitor_connect_interval=60000
    monitor_ping_interval=10000
    monitor_read_only_interval=1500
    monitor_read_only_timeout=500
    ping_interval_server=120000
    ping_timeout_server=500
    commands_stats=true
    sessions_sort=true
    connect_retries_on_failure=10
}
mysql_servers =
(
    {
        address = "192.168.42.152"  # no default, required. If port is 0, address is interpreted as a Unix domain socket
        port = 3306                 # no default, required. If port is 0, address is interpreted as a Unix domain socket
        hostgroup = 0               # no default, required
        status = "ONLINE"           # default: ONLINE
        weight = 1                  # default: 1
        compression = 0             # default: 0
    },
    {
        address = "192.168.42.153"
        port = 3306
        hostgroup = 1
        status = "ONLINE"
        weight = 1
        compression = 0
    },
    {
        address = "192.168.42.154"
        port = 3306
        hostgroup = 1
        status = "ONLINE"
        weight = 1
        compression = 0
    },
    {
        address = "192.168.42.155"
        port = 3306
        hostgroup = 1
        status = "ONLINE"
        weight = 1
        compression = 0
    }
)
mysql_users:
(
    {
        username = "mageedu"
        password = "123456"
        default_hostgroup = 0
        max_connections=1000
        default_schema="test"
        active = 1
    }
)
mysql_query_rules:
(
)
scheduler=
(
)
mysql_replication_hostgroups=
(
    {
        writer_hostgroup=0
        reader_hostgroup=1
    }
)
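One caveat: the monitor_read_only_* settings above rely on ProxySQL's monitor module, which by default logs in to the backends as monitor/monitor. If no such account exists on the MariaDB servers, the read-only probing that keeps each server in the right hostgroup will fail. A sketch of creating it on the master, so it replicates to the slaves (this grant is an assumption added here, not part of the original walkthrough):

GRANT REPLICATION CLIENT ON *.* TO 'monitor'@'192.168.42.%' IDENTIFIED BY 'monitor';
FLUSH PRIVILEGES;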
(3). Start ProxySQL
[root@node2 init.d]# service proxysql start
Starting ProxySQL: DONE!
[root@node2 init.d]# ss -tnl
State      Recv-Q Send-Q   Local Address:Port   Peer Address:Port
LISTEN     0      128      127.0.0.1:6032       *:*
LISTEN     0      128      *:22                 *:*
LISTEN     0      100      127.0.0.1:25         *:*
LISTEN     0      1024     *:3306               *:*
LISTEN     0      1024     *:3306               *:*
LISTEN     0      1024     *:3306               *:*
LISTEN     0      1024     *:3306               *:*
LISTEN     0      128      :::22                :::*
LISTEN     0      100      ::1:25               :::*
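With ProxySQL listening, its admin interface can be used to confirm that the backends and hostgroups were loaded as configured (the admin credentials come from proxysql.cnf; the column names are those of ProxySQL 1.3 and may differ in other versions):

mysql -uadmin -padmin -h127.0.0.1 -P6032 -e 'SELECT hostgroup_id, hostname, port, status FROM mysql_servers;'
mysql -uadmin -padmin -h127.0.0.1 -P6032 -e 'SELECT * FROM mysql_replication_hostgroups;'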
(4). Connect to MySQL and take a look
[root@node1 ~]# mysql -umageedu -p123456 -h 192.168.42.152
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 387
Server version: 5.5.52-MariaDB MariaDB Server

Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| hellowword         |
| mysql              |
| performance_schema |
| test               |
+--------------------+
5 rows in set (0.01 sec)
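Note that the command above connects to the master directly; to exercise the ProxySQL listener itself, connect to the proxy's own 3306 interface on node1 and compare the reported server_id with the ids configured earlier (which backend answers depends on the routing rules in effect; this check is an addition to the original):

mysql -umageedu -p123456 -h192.168.42.150 -P3306 -e 'SELECT @@server_id;'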
(5). Configure node2 the same way, start it, and test it
4. Make ProxySQL on node1 and node2 highly available (HA for the read/write splitting layer)
(1). Install keepalived on node1 and node2
yum install keepalived -y
(2). keepalived configuration on node1:
! Configuration File for keepalived

global_defs {
    notification_email {
        root@localhost
    }
    notification_email_from keepalived@localhost
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id node0
    vrrp_mcast_group4 224.1.101.23
}

# If the marker file exists the check fails and the priority is lowered,
# forcing a switchover; if it does not exist the check passes
vrrp_script chk_down {
    script "[[ -f /etc/keepalived/down ]] && exit 1 || exit 0"
    interval 1
    weight -10
    fall 1
    rise 1
}

# Health check: verify that the proxysql process is alive
vrrp_script chk_proxysql {
    script "killall -0 proxysql && exit 0 || exit 1"
    interval 1
    weight -10
    fall 1
    rise 1
}

vrrp_instance sr1 {
    state MASTER
    interface ens33
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass rEiszbuO
    }
    virtual_ipaddress {
        192.168.42.182/24 dev ens33 label ens33:0
    }
    # invoke the check scripts
    track_script {
        chk_down
        chk_proxysql
    }
    notify_master "/etc/keepalived/notify.sh master"
    notify_backup "/etc/keepalived/notify.sh backup"
    notify_fault "/etc/keepalived/notify.sh fault"
}

(3). keepalived configuration on node2

! Configuration File for keepalived

global_defs {
    notification_email {
        root@localhost
    }
    notification_email_from keepalived@localhost
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id node1
    vrrp_mcast_group4 224.1.101.23
}

# If the marker file exists the check fails and the priority is lowered,
# forcing a switchover; if it does not exist the check passes
vrrp_script chk_down {
    script "[[ -f /etc/keepalived/down ]] && exit 1 || exit 0"
    interval 1
    weight -10
    fall 1
    rise 1
}

# Health check: verify that the proxysql process is alive
vrrp_script chk_proxysql {
    script "killall -0 proxysql && exit 0 || exit 1"
    interval 1
    weight -10
    fall 1
    rise 1
}

vrrp_instance sr1 {
    state BACKUP
    interface ens33
    virtual_router_id 51
    priority 96
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass rEiszbuO
    }
    virtual_ipaddress {
        192.168.42.182/24 dev ens33 label ens33:0
    }
    # invoke the check scripts
    track_script {
        chk_down
        chk_proxysql
    }
    notify_master "/etc/keepalived/notify.sh master"
    notify_backup "/etc/keepalived/notify.sh backup"
    notify_fault "/etc/keepalived/notify.sh fault"
}
(4). The notify.sh script
#!/bin/bash
#
contact='root@localhost'

notify() {
    mailsubject="vrrp: $(hostname) to be $1"
    mailbody="$(hostname) to be $1, vrrp transition, $(date)."
    echo "$mailbody" | mail -s "$mailsubject" $contact
}

case $1 in
master)
    notify master
    service proxysql start
    ;;
backup)
    notify backup
    service proxysql start
    ;;
fault)
    notify fault
    service proxysql stop
    ;;
*)
    echo "Usage: $(basename $0) {master|backup|fault}"
    exit 1
    ;;
esac
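The script has to be present and executable on both node1 and node2, otherwise keepalived cannot run the notify hooks; a small deployment sketch (the target path simply mirrors the path used in the configuration above):

chmod +x /etc/keepalived/notify.sh
scp -p /etc/keepalived/notify.sh root@192.168.42.151:/etc/keepalived/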
(5). Because keepalived uses a floating (VIP) address, the IP bindings in the proxysql.cnf configured above need to be changed
mysql_ifaces="192.168.42.182:6032;/tmp/proxysql_admin.sock"
interfaces="192.168.42.182:3306;/tmp/proxysql.sock"
Remember to change this on both node1 and node2!
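One more caveat: on whichever node does not currently hold the VIP, ProxySQL may refuse to start because 192.168.42.182 is not a local address. A common workaround (an assumption added here, not something the original configures) is to allow binding to non-local addresses:

echo 'net.ipv4.ip_nonlocal_bind = 1' >> /etc/sysctl.conf
sysctl -p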
(6). Start keepalived on node1 and test
service proxysql stop
killall proxysql
systemctl start keepalived
ifconfig
ens33:0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.42.182  netmask 255.255.255.0  broadcast 0.0.0.0
        ether 00:0c:29:c1:a3:0a  txqueuelen 1000  (Ethernet)
[root@node2 keepalived]# ss -tnl
State      Recv-Q Send-Q   Local Address:Port   Peer Address:Port
LISTEN     0      128      127.0.0.1:6032       *:*
LISTEN     0      128      *:22                 *:*
LISTEN     0      100      127.0.0.1:25         *:*
LISTEN     0      1024     *:3306               *:*
LISTEN     0      1024     *:3306               *:*
LISTEN     0      1024     *:3306               *:*
LISTEN     0      1024     *:3306               *:*
LISTEN     0      128      :::22                :::*
LISTEN     0      100      ::1:25               :::*
(7). Start keepalived on node2 as well
service proxysql stop
killall proxysql
systemctl start keepalived
At this point ifconfig will not show the ens33:0 address.
You can see that proxysql has been started:
[root@node2 keepalived]# ss -tnl
State      Recv-Q Send-Q   Local Address:Port   Peer Address:Port
LISTEN     0      128      127.0.0.1:6032       *:*
LISTEN     0      128      *:22                 *:*
LISTEN     0      100      127.0.0.1:25         *:*
LISTEN     0      1024     *:3306               *:*
LISTEN     0      1024     *:3306               *:*
LISTEN     0      1024     *:3306               *:*
LISTEN     0      1024     *:3306               *:*
LISTEN     0      128      :::22                :::*
LISTEN     0      100      ::1:25               :::*
(8). Stop keepalived on node1
systemctl stop keepalived
killall proxysql
(9). On node2, run ifconfig to check whether the 192.168.42.182 address has floated over
ifconfig
ens33:0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.42.182  netmask 255.255.255.0  broadcast 0.0.0.0
        ether 00:0c:29:c1:a3:0a  txqueuelen 1000  (Ethernet)
Sure enough, the address has floated over. The ProxySQL high availability setup is now complete.
5. Next, make the MariaDB master node highly available. The approach here is MHA, which promotes a slave to master when the master fails.
MHA (Master HA) is an open-source high-availability program for MySQL that provides automated master failover for MySQL master/slave replication architectures. When MHA detects a master failure, it promotes the slave holding the most recent data to become the new master; during this process MHA obtains additional information from the other slaves to avoid consistency problems. MHA also provides online master switchover, i.e., switching the master/slave roles on demand.
The MHA service has two roles, MHA Manager (management node) and MHA Node (data node). MHA Manager is usually deployed on a separate machine and can manage multiple master/slave clusters; each master/slave cluster is called an application. MHA Node runs on every MySQL server (master/slave/manager); it speeds up failover through scripts that can parse and purge logs.
(1). Generate an SSH key pair on node3 [the MariaDB master]
ssh-keygen -t rsa -P ''
cd .ssh
# copy the key to this host first
ssh-copy-id -i id_rsa.pub root@192.168.42.152
# then copy it to node4, node5, node6, node1 and node2
ssh-copy-id -i id_rsa.pub root@192.168.42.153
ssh-copy-id -i id_rsa.pub root@192.168.42.154
ssh-copy-id -i id_rsa.pub root@192.168.42.155
ssh-copy-id -i id_rsa.pub root@192.168.42.150
ssh-copy-id -i id_rsa.pub root@192.168.42.151
scp id_rsa id_rsa.pub root@192.168.42.153:~/.ssh/
scp id_rsa id_rsa.pub root@192.168.42.154:~/.ssh/
scp id_rsa id_rsa.pub root@192.168.42.155:~/.ssh/
scp id_rsa id_rsa.pub root@192.168.42.150:~/.ssh/
scp id_rsa id_rsa.pub root@192.168.42.151:~/.ssh/
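A quick loop to confirm that passwordless SSH now works from this node to every host (added as a convenience check, not part of the original steps):

for ip in 192.168.42.150 192.168.42.151 192.168.42.152 192.168.42.153 192.168.42.154 192.168.42.155; do
    ssh -o BatchMode=yes root@$ip hostname
done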
(2). Download MHA on node1, node2, node3, node4, node5 and node6
cd ~
Download MHA:
wget -c http://192.168.42.26/install_package/down/Sources/mha/mha4mysql-manager-0.56-0.el6.noarch.rpm
wget -c http://192.168.42.26/install_package/down/Sources/mha/mha4mysql-node-0.56-0.el6.noarch.rpm
(3). node1 and node2 serve as the manager nodes and will also be made highly available. On node1:
yum install mha4mysql* -y
Do the same on node2.
(4). On node3, node4, node5 and node6 only mha4mysql-node-0.56-0.el6.noarch.rpm needs to be installed
yum install mha4mysql-node-0.56-0.el6.noarch.rpm -y
(5). The manager node needs a dedicated configuration file for each monitored master/slave cluster, while all master/slave clusters can also share a global configuration. The global configuration file defaults to /etc/masterha_default.cnf and is optional. If only one master/slave cluster is monitored, the per-server defaults can also be supplied directly in the application's configuration, whose path can be chosen freely.
In this example, /etc/masterha/app1.cnf is used. First create the directory:
mkdir /etc/masterha
vim /etc/masterha/app1.cnf
Its contents are as follows:
[server default]
user=mageedu
password=123456
manager_workdir=/data/masterha/app1
manager_log=/data/masterha/app1/manager.log
remote_workdir=/data/masterha/app1
ssh_user=root
repl_user=repuser
repl_password=repass
ping_interval=1

[server1]
hostname=192.168.42.152
candidate_master=1

[server2]
hostname=192.168.42.153
candidate_master=1

[server3]
hostname=192.168.42.154
candidate_master=1

[server4]
hostname=192.168.42.155
candidate_master=1

(6). Check that SSH trust between all the nodes is configured correctly:
masterha_check_ssh --conf=/etc/masterha/app1.cnf
If the last line of the output looks like the following, the check passed.
[info] All SSH connection tests passed successfully.

(7). Adjust the master and slave configurations
Configuration for the initial master node:
[mysqld]
innodb_file_per_table=ON
skip_name_resolve=ON
server_id = 1
log_bin = log-bin
relay-log=relay-bin
Configuration required on all slave nodes:
[mysqld]
innodb_file_per_table=ON
skip_name_resolve=ON
server_id = 11
relay_log=relay-log
read_only=ON
log_bin = log-bin
relay_log_purge=0
(8). Check that the replication configuration of the managed MySQL cluster is OK:
masterha_check_repl --conf=/etc/masterha/app1.cnf
The output looks like the following; the final "MySQL Replication Health is OK." line indicates the check passed.
Fri Jul 14 15:40:37 2017 - [info] Checking replication health on 192.168.42.153..
Fri Jul 14 15:40:37 2017 - [info]  ok.
Fri Jul 14 15:40:37 2017 - [info] Checking replication health on 192.168.42.154..
Fri Jul 14 15:40:37 2017 - [info]  ok.
Fri Jul 14 15:40:37 2017 - [info] Checking replication health on 192.168.42.155..
Fri Jul 14 15:40:37 2017 - [info]  ok.
Fri Jul 14 15:40:37 2017 - [warning] master_ip_failover_script is not defined.
Fri Jul 14 15:40:37 2017 - [warning] shutdown_script is not defined.
Fri Jul 14 15:40:37 2017 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.
(9). Start MHA:
nohup masterha_manager --conf=/etc/masterha/app1.cnf >/data/masterha/app1/manager.log 2>&1 &
ps -aux
root      75846  1.6  2.1 296664 21652 pts/1  S   15:47  0:00 perl /usr/bin/masterha_manager --conf=/etc/masterha/app1.cnf
root      75951  0.0  0.0      0     0 ?      S   15:47  0:00 [kworker/u256:0]
root      76088  0.0  0.1 151056  1836 pts/1  R+  15:48  0:00 ps -aux
After a successful start, the status of the master node can be checked with the following command.
[root@node1 ~]# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 (pid:75846) is running(0:PING_OK), master:192.168.42.152
The message "app1 (pid:75846) is running(0:PING_OK)" above indicates that the MHA service is running properly; otherwise it would show something like "app1 is stopped(1:NOT_RUNNING).".
To stop MHA, use the masterha_stop command.
masterha_stop --conf=/etc/masterha/app1.cnf
Stopped app1 successfully.
[1]+  Exit 1    nohup masterha_manager --conf=/etc/masterha/app1.cnf > /data/masterha/app1/manager.log 2>&1
(10). Test failover
1. Stop the MariaDB service on the master node
killall -9 mysqld mysqld_safe
Check again on node1:
[root@node1 ~]# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 is stopped(2:NOT_RUNNING).
[1]+  Done    nohup masterha_manager --conf=/etc/masterha/app1.cnf > /data/masterha/app1/manager.log 2>&1
2. Check the log on the manager node
cat /data/masterha/app1/manager.log
The following entries in the log file show that the manager detected the failure of node 192.168.42.152 and then performed an automatic failover, promoting 192.168.42.153 to be the new master.
----- Failover Report -----

app1: MySQL Master failover 192.168.42.152(192.168.42.152:3306) to 192.168.42.153(192.168.42.153:3306) succeeded

Master 192.168.42.152(192.168.42.152:3306) is down!

Check MHA Manager logs at node1:/data/masterha/app1/manager.log for details.

Started automated(non-interactive) failover.
The latest slave 192.168.42.153(192.168.42.153:3306) has all relay logs for recovery.
Selected 192.168.42.153(192.168.42.153:3306) as a new master.
192.168.42.153(192.168.42.153:3306): OK: Applying all logs succeeded.
192.168.42.155(192.168.42.155:3306): This host has the latest relay log events.
192.168.42.154(192.168.42.154:3306): This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
192.168.42.155(192.168.42.155:3306): OK: Applying all logs succeeded. Slave started, replicating from 192.168.42.153(192.168.42.153:3306)
192.168.42.154(192.168.42.154:3306): OK: Applying all logs succeeded. Slave started, replicating from 192.168.42.153(192.168.42.153:3306)
192.168.42.153(192.168.42.153:3306): Resetting slave info succeeded.
Master failover to 192.168.42.153(192.168.42.153:3306) completed successfully.
Note that after the failover completes, the manager stops automatically; running masterha_check_status at this point will report an error, as shown below.
masterha_check_status --conf=/etc/masterha/app1.cnf
app1 is stopped(2:NOT_RUNNING).
- Provide a new slave node to repair the replication cluster
(1). Back up the data on the new master node
mysqldump -uroot -x -R -E --triggers --master-data=2 --all-databases > alldb.sql
scp alldb.sql root@192.168.42.152:~
(2). Operations on node3
Remove all of the old data
rm -rf /var/lib/mysql/*
Change the former master's configuration to a slave configuration
vim /etc/my.cnf.d/server.cnf
[mysqld]
innodb_file_per_table=ON
skip_name_resolve=ON
server_id = 1
relay_log=relay-log
read_only=ON
log_bin = log-bin
relay_log_purge=0
Start MariaDB
systemctl start mariadb
Import the data
mysql < /root/alldb.sql
Find the replication coordinates
head -30 /root/alldb.sql
The coordinates are recorded in the dump:
CHANGE MASTER TO MASTER_LOG_FILE='log-bin.000001', MASTER_LOG_POS=328;
Log in to MySQL and set up replication to the new master
CHANGE MASTER TO MASTER_HOST='192.168.42.153',MASTER_USER='repuser',MASTER_PASSWORD='repass',MASTER_LOG_FILE='log-bin.000001',MASTER_LOG_POS=328;
START SLAVE;
SHOW SLAVE STATUS\G
Drop a database on the current master and take a look
MariaDB [(none)]> drop database hellowword;
Query OK, 0 rows affected (0.00 sec)
Check on node3:
MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+
4 rows in set (0.00 sec)
The database has been dropped on node3 as well, so the failover and recovery cycle has succeeded.
First run flush privileges on the current MariaDB master, then go back to the manager node.
The steps are the same as before:
(8). Check that the replication configuration of the managed MySQL cluster is OK
(9). Start MHA
exactly as described above.
After the original master node fails, a new MySQL node needs to be prepared. Restore its data from a backup taken from the master, then configure it as a slave of the new master. Note that if the node being added is a brand-new machine, its IP address should be configured as the original master's IP; otherwise the corresponding IP address in app1.cnf has to be changed. Then start the manager again and check its status once more.
Integrating the MHA high availability with the ProxySQL high availability layer will be covered in a follow-up article.
Original article by renjin. If you reproduce it, please cite the source: http://www.178linux.com/81748