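In MySQL replication, when the master crashes, MHA fails over to a new master automatically and repoints the remaining slaves to it. The deployment steps follow.
(1) Prepare three machines: master 192.168.8.120, candidate master 192.168.8.121, and a slave that also serves as the management node, 192.168.8.122.
(2) Set the host names on every machine. For example, on the management node 192.168.8.122: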
[root@centos3 mha]# more /etc/hosts
127.0.0.1 localhost
192.168.8.120 centos1
192.168.8.121 centos2
192.168.8.122 centos3
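
Before going further it can help to confirm that every node has all three names in its hosts file. The following is a minimal sketch, not part of MHA: check_hosts_file is a hypothetical helper, and it only checks that each name appears as a field on a non-comment line.

```shell
# Hypothetical helper (not part of MHA): report which expected host names
# are missing from a hosts file. Returns 0 only if all names are present.
check_hosts_file() {
    local hosts_file="$1"; shift
    local name missing=0
    for name in "$@"; do
        # Look for the name as a whole field after the address column.
        if ! awk -v n="$name" '
                /^[ \t]*#/ { next }
                { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                END { exit found ? 0 : 1 }' "$hosts_file"; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}
```

Usage, assuming the names from the listing above: check_hosts_file /etc/hosts centos1 centos2 centos3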
(3) Data nodes
The packages are mha4mysql-node-0.53.tar.gz and mha4mysql-manager-0.53.tar.gz. mha4mysql-node depends on perl-DBD-MySQL; mha4mysql-manager depends on perl-Config-Tiny, perl-Params-Validate, perl-Log-Dispatch and perl-Parallel-ForkManager, so install these dependencies first. This walkthrough installs them with yum.
The three MariaDB data nodes only need mha4mysql-node-0.53.tar.gz. Installing MariaDB and setting up replication are not covered here.
[root@centos1 mha]# rpm -ivh http://dl./pub/epel/5/i386/epel-release-5-4.noarch.rpm
[root@centos1 mha]# yum -y install perl-DBD-MySQL ncftp
[root@centos1 mha]# tar -zxf mha4mysql-node-0.53.tar.gz
[root@centos1 mha]# cd mha4mysql-node-0.53
[root@centos1 mha]# perl Makefile.PL
[root@centos1 mha]# make && make install
(4) Management node
[root@sh-gs-dbmg0227 ~]# rpm -ivh http://dl./pub/epel/5/i386/epel-release-5-4.noarch.rpm
// The URL above is for CentOS 5.x; on 6.x use: rpm -ivh http://dl./pub/epel/6/i386/epel-release-6-8.noarch.rpm
[root@centos3 mha]# yum -y install perl-DBD-MySQL ncftp
[root@centos3 mha]# tar -zxf mha4mysql-node-0.53.tar.gz
[root@centos3 mha]# cd mha4mysql-node-0.53
[root@centos3 mha]# perl Makefile.PL
[root@centos3 mha]# make && make install
[root@centos3 mha]# yum -y install perl-Config-Tiny perl-Params-Validate perl-Log-Dispatch perl-Parallel-ForkManager perl-Config-IniFiles
[root@centos3 mha]# tar -zxf mha4mysql-manager-0.53.tar.gz
[root@centos3 mha]# cd mha4mysql-manager-0.53
[root@centos3 mha]# perl Makefile.PL
If the following error appears during this step:
Can't locate ExtUtils/MakeMaker.pm in @INC (@INC contains: inc /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at inc/Module/Install/Can.pm line 6.
Fix:
yum install perl-CPAN
[root@centos3 mha]# make && make install
[root@centos3 mha]# mkdir /etc/masterha
[root@centos3 mha]# mkdir -p /master/app1
[root@centos3 mha]# mkdir -p /scripts
[root@centos3 mha]# cp samples/conf/* /etc/masterha/
[root@centos3 mha]# cp samples/scripts/* /scripts
Configure the management node
[root@centos3 mha]# more /etc/masterha/masterha_default.cnf
[server default]
user=root
password=123456
ssh_user=root
repl_user=repl
repl_password=qwert
master_binlog_dir= /app/mysql
remote_workdir=/app/mha
secondary_check_script= masterha_secondary_check -s 192.168.8.121 -s 192.168.8.122
ping_interval=1
master_ip_failover_script=/scripts/master_ip_failover
#shutdown_script=/scripts/power_manager
report_script= /scripts/send_report
master_ip_online_change_script= /scripts/master_ip_online_change
[root@centos3 mha]# more /etc/masterha/app1.cnf
[server default]
manager_workdir=/app/mha
manager_log=/app/mha/manager.log
[server1]
hostname=192.168.8.120
candidate_master=1
[server2]
hostname=192.168.8.121
candidate_master=1
[server3]
hostname=192.168.8.122
no_master=1
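
To double-check which hosts MHA may promote, the per-server sections of app1.cnf can be summarised. This is a rough sketch under stated assumptions: list_mha_servers is a hypothetical helper, it only understands plain key=value lines inside [serverN] sections, and the role label "normal" is just this sketch's name for a server with neither flag set.

```shell
# Hypothetical helper (not part of MHA): print "section hostname role"
# for every [serverN] section of an MHA application config file.
list_mha_servers() {
    awk -F= '
        function emit() { if (section != "" && host != "") print section, host, role }
        /^\[server[0-9]+\]/ {
            emit()
            section = substr($0, 2, index($0, "]") - 2)
            host = ""; role = "normal"
            next
        }
        /^\[/ { emit(); section = ""; next }
        section != "" {
            key = $1; val = $2
            gsub(/[ \t]/, "", key)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            if (key == "hostname") host = val
            else if (key == "candidate_master" && val == "1") role = "candidate_master"
            else if (key == "no_master" && val == "1") role = "no_master"
        }
        END { emit() }
    ' "$1"
}
```

Usage: list_mha_servers /etc/masterha/app1.cnf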
(5) Create the MySQL users and apply the replication settings
On the master:
grant replication slave on *.* to identified by 'qwert';
grant all on *.* to identified by '123456';
On the candidate master:
grant replication slave on *.* to identified by 'qwert';
grant all on *.* to identified by '123456';
set global read_only=1;
set global relay_log_purge=0;
On the slave:
grant all on *.* to identified by '123456';
set global read_only=1;
set global relay_log_purge=0;
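
For reference, a complete GRANT names the account as 'user'@'host' before IDENTIFIED BY. The sketch below only prints the statements for review. print_mha_grants is a hypothetical helper, and the host pattern in the usage line is an assumption, not a value taken from this setup; the user names and passwords are the ones from masterha_default.cnf.

```shell
# Hypothetical helper: print the two kinds of GRANT statements used above
# for a given account host. Nothing is executed against MySQL.
# Arguments: repl_user repl_password admin_user admin_password host_pattern
print_mha_grants() {
    local repl_user="$1" repl_pass="$2" admin_user="$3" admin_pass="$4" host="$5"
    printf "GRANT REPLICATION SLAVE ON *.* TO '%s'@'%s' IDENTIFIED BY '%s';\n" \
        "$repl_user" "$host" "$repl_pass"
    printf "GRANT ALL ON *.* TO '%s'@'%s' IDENTIFIED BY '%s';\n" \
        "$admin_user" "$host" "$admin_pass"
}
```

Usage, with an assumed host pattern covering the three nodes: print_mha_grants repl qwert root 123456 '192.168.8.%'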
(6) Configure SSH
[root@centos3 ~]# ssh-keygen -t rsa
[root@centos3 ~]# ssh-copy-id -i .ssh/id_rsa.pub root@192.168.8.120
[root@centos3 ~]# ssh-copy-id -i .ssh/id_rsa.pub root@192.168.8.121
[root@centos1 ~]# ssh-keygen -t rsa
[root@centos1 ~]# ssh-copy-id -i .ssh/id_rsa.pub root@192.168.8.121
[root@centos1 ~]# ssh-copy-id -i .ssh/id_rsa.pub root@192.168.8.122
[root@centos2 ~]# ssh-keygen -t rsa
[root@centos2 ~]# ssh-copy-id -i .ssh/id_rsa.pub root@192.168.8.120
[root@centos2 ~]# ssh-copy-id -i .ssh/id_rsa.pub root@192.168.8.122
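
The commands above give every node key-based SSH access to the other two. The pattern can be sketched as a small loop that only prints what has to be run where; print_ssh_mesh is a hypothetical helper and runs nothing itself.

```shell
# Hypothetical helper: for a list of hosts, print the ssh-copy-id command
# each host must run against every other host (a full mesh).
print_ssh_mesh() {
    local src dst
    for src in "$@"; do
        for dst in "$@"; do
            # A host does not need to copy its key to itself.
            [ "$src" = "$dst" ] && continue
            echo "on $src: ssh-copy-id -i ~/.ssh/id_rsa.pub root@$dst"
        done
    done
    return 0
}
```

Usage: print_ssh_mesh 192.168.8.120 192.168.8.121 192.168.8.122 prints six commands, two per host, one for each ordered pair of distinct hosts.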
(7) Test SSH
[root@centos3 mha]# masterha_check_ssh --global_conf=/etc/masterha/masterha_default.cnf --conf=/etc/masterha/app1.cnf
Sat Aug 10 06:15:39 2013 - [info] Reading default configuratoins from /etc/masterha/masterha_default.cnf..
Sat Aug 10 06:15:39 2013 - [info] Reading application default configurations from /etc/masterha/app1.cnf..
Sat Aug 10 06:15:39 2013 - [info] Reading server configurations from /etc/masterha/app1.cnf..
Sat Aug 10 06:15:39 2013 - [info] Starting SSH connection tests..
Sat Aug 10 06:15:42 2013 - [debug]
Sat Aug 10 06:15:39 2013 - [debug]  Connecting via SSH from root@192.168.8.120(192.168.8.120:22) to root@192.168.8.121(192.168.8.121:22)..
Sat Aug 10 06:15:41 2013 - [debug]   ok.
Sat Aug 10 06:15:41 2013 - [debug]  Connecting via SSH from root@192.168.8.120(192.168.8.120:22) to root@192.168.8.122(192.168.8.122:22)..
Sat Aug 10 06:15:42 2013 - [debug]   ok.
Sat Aug 10 06:15:43 2013 - [debug]
Sat Aug 10 06:15:40 2013 - [debug]  Connecting via SSH from root@192.168.8.121(192.168.8.121:22) to root@192.168.8.120(192.168.8.120:22)..
Sat Aug 10 06:15:41 2013 - [debug]   ok.
Sat Aug 10 06:15:41 2013 - [debug]  Connecting via SSH from root@192.168.8.121(192.168.8.121:22) to root@192.168.8.122(192.168.8.122:22)..
Sat Aug 10 06:15:43 2013 - [debug]   ok.
Sat Aug 10 06:15:44 2013 - [debug]
Sat Aug 10 06:15:40 2013 - [debug]  Connecting via SSH from root@192.168.8.122(192.168.8.122:22) to root@192.168.8.120(192.168.8.120:22)..
Sat Aug 10 06:15:42 2013 - [debug]   ok.
Sat Aug 10 06:15:42 2013 - [debug]  Connecting via SSH from root@192.168.8.122(192.168.8.122:22) to root@192.168.8.121(192.168.8.121:22)..
Sat Aug 10 06:15:44 2013 - [debug]   ok.
Sat Aug 10 06:15:44 2013 - [info] All SSH connection tests passed successfully.
(8) Test replication
[root@centos3 mha]# masterha_check_repl --global_conf=/etc/masterha/masterha_default.cnf --conf=/etc/masterha/app1.cnf
Sat Aug 10 06:26:13 2013 - [info] Reading default configuratoins from /etc/masterha/masterha_default.cnf..
Sat Aug 10 06:26:13 2013 - [info] Reading application default configurations from /etc/masterha/app1.cnf..
Sat Aug 10 06:26:13 2013 - [info] Reading server configurations from /etc/masterha/app1.cnf..
Sat Aug 10 06:26:13 2013 - [info] MHA::MasterMonitor version 0.53.
Sat Aug 10 06:26:13 2013 - [info] Dead Servers:
Sat Aug 10 06:26:13 2013 - [info] Alive Servers:
Sat Aug 10 06:26:13 2013 - [info]   192.168.8.120(192.168.8.120:3306)
Sat Aug 10 06:26:13 2013 - [info]   192.168.8.121(192.168.8.121:3306)
Sat Aug 10 06:26:13 2013 - [info]   192.168.8.122(192.168.8.122:3306)
Sat Aug 10 06:26:13 2013 - [info] Alive Slaves:
Sat Aug 10 06:26:13 2013 - [info]   192.168.8.121(192.168.8.121:3306)  Version=5.5.29-log (oldest major version between slaves) log-bin:enabled
Sat Aug 10 06:26:13 2013 - [info]     Replicating from 192.168.8.120(192.168.8.120:3306)
Sat Aug 10 06:26:13 2013 - [info]     Primary candidate for the new Master (candidate_master is set)
Sat Aug 10 06:26:13 2013 - [info]   192.168.8.122(192.168.8.122:3306)  Version=5.5.29-log (oldest major version between slaves) log-bin:enabled
Sat Aug 10 06:26:13 2013 - [info]     Replicating from 192.168.8.120(192.168.8.120:3306)
Sat Aug 10 06:26:13 2013 - [info]     Not candidate for the new Master (no_master is set)
Sat Aug 10 06:26:13 2013 - [info] Current Alive Master: 192.168.8.120(192.168.8.120:3306)
Sat Aug 10 06:26:13 2013 - [info] Checking slave configurations..
Sat Aug 10 06:26:13 2013 - [info] Checking replication filtering settings..
Sat Aug 10 06:26:13 2013 - [info]  binlog_do_db= , binlog_ignore_db=
Sat Aug 10 06:26:13 2013 - [info]  Replication filtering check ok.
Sat Aug 10 06:26:13 2013 - [info] Starting SSH connection tests..
Sat Aug 10 06:26:17 2013 - [info] All SSH connection tests passed successfully.
Sat Aug 10 06:26:17 2013 - [info] Checking MHA Node version..
Sat Aug 10 06:26:18 2013 - [info]  Version check ok.
Sat Aug 10 06:26:18 2013 - [info] Checking SSH publickey authentication settings on the current master..
Sat Aug 10 06:26:18 2013 - [info] HealthCheck: SSH to 192.168.8.120 is reachable.
Sat Aug 10 06:26:18 2013 - [info] Master MHA Node version is 0.53.
Sat Aug 10 06:26:18 2013 - [info] Checking recovery script configurations on the current master..
Sat Aug 10 06:26:18 2013 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/app/mysql --output_file=/app/mha/save_binary_logs_test --manager_version=0.53 --start_file=mysql-bin.000010
Sat Aug 10 06:26:18 2013 - [info]   Connecting to root@192.168.8.120(192.168.8.120)..
  Creating /app/mha if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /app/mysql, up to mysql-bin.000010
Sat Aug 10 06:26:18 2013 - [info] Master setting check done.
Sat Aug 10 06:26:18 2013 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Sat Aug 10 06:26:18 2013 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user=root --slave_host=192.168.8.121 --slave_ip=192.168.8.121 --slave_port=3306 --workdir=/app/mha --target_version=5.5.29-log --manager_version=0.53 --relay_log_info=/app/mysql/relay-log.info --relay_dir=/app/mysql/ --slave_pass=xxx
Sat Aug 10 06:26:18 2013 - [info]   Connecting to root@192.168.8.121(192.168.8.121:22)..
  Checking slave recovery environment settings..
    Opening /app/mysql/relay-log.info ... ok.
    Relay log found at /app/mysql, up to mysql-relay-bin.000004
    Temporary relay log file is /app/mysql/mysql-relay-bin.000004
    Testing mysql connection and privileges.. done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Sat Aug 10 06:26:19 2013 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user=root --slave_host=192.168.8.122 --slave_ip=192.168.8.122 --slave_port=3306 --workdir=/app/mha --target_version=5.5.29-log --manager_version=0.53 --relay_log_info=/app/mysql/relay-log.info --relay_dir=/app/mysql/ --slave_pass=xxx
Sat Aug 10 06:26:19 2013 - [info]   Connecting to root@192.168.8.122(192.168.8.122:22)..
  Checking slave recovery environment settings..
    Opening /app/mysql/relay-log.info ... ok.
    Relay log found at /app/mysql, up to mysql-relay-bin.000004
    Temporary relay log file is /app/mysql/mysql-relay-bin.000004
    Testing mysql connection and privileges.. done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Sat Aug 10 06:26:20 2013 - [info] Slaves settings check done.
Sat Aug 10 06:26:20 2013 - [info]
192.168.8.120 (current master)
 +--192.168.8.121
 +--192.168.8.122
Sat Aug 10 06:26:20 2013 - [info] Checking replication health on 192.168.8.121..
Sat Aug 10 06:26:20 2013 - [info]  ok.
Sat Aug 10 06:26:20 2013 - [info] Checking replication health on 192.168.8.122..
Sat Aug 10 06:26:20 2013 - [info]  ok.
Sat Aug 10 06:26:20 2013 - [info] Checking master_ip_failover_script status:
Sat Aug 10 06:26:20 2013 - [info]   /scripts/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.8.120 --orig_master_ip=192.168.8.120 --orig_master_port=3306
Sat Aug 10 06:26:20 2013 - [info]  OK.
Sat Aug 10 06:26:20 2013 - [warning] shutdown_script is not defined.
Sat Aug 10 06:26:20 2013 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
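
When this check is scripted, one simple approach is to look at the last line of the output. The following is a minimal sketch: repl_health_ok is a hypothetical helper that reads the output on standard input and succeeds only if the final non-empty line is the success message shown in the log above.

```shell
# Hypothetical helper: read masterha_check_repl output on stdin and return 0
# only if the last non-empty line is the success message seen above.
repl_health_ok() {
    awk '
        NF { last = $0 }
        END {
            if (last == "MySQL Replication Health is OK.") exit 0
            exit 1
        }'
}
```

Usage: masterha_check_repl --global_conf=/etc/masterha/masterha_default.cnf --conf=/etc/masterha/app1.cnf 2>&1 | repl_health_ok && echo "replication check passed"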
(9) Start the manager
[root@centos3 mysql]# nohup masterha_manager --global_conf=/etc/masterha/masterha_default.cnf --conf=/etc/masterha/app1.cnf
Sat Aug 10 06:29:36 2013 - [info] MHA::MasterMonitor version 0.53.
Sat Aug 10 06:29:37 2013 - [info] Dead Servers:
Sat Aug 10 06:29:37 2013 - [info] Alive Servers:
Sat Aug 10 06:29:37 2013 - [info]   192.168.8.120(192.168.8.120:3306)
Sat Aug 10 06:29:37 2013 - [info]   192.168.8.121(192.168.8.121:3306)
Sat Aug 10 06:29:37 2013 - [info]   192.168.8.122(192.168.8.122:3306)
Sat Aug 10 06:29:37 2013 - [info] Alive Slaves:
Sat Aug 10 06:29:37 2013 - [info]   192.168.8.121(192.168.8.121:3306)  Version=5.5.29-log (oldest major version between slaves) log-bin:enabled
Sat Aug 10 06:29:37 2013 - [info]     Replicating from 192.168.8.120(192.168.8.120:3306)
Sat Aug 10 06:29:37 2013 - [info]     Primary candidate for the new Master (candidate_master is set)
Sat Aug 10 06:29:37 2013 - [info]   192.168.8.122(192.168.8.122:3306)  Version=5.5.29-log (oldest major version between slaves) log-bin:enabled
Sat Aug 10 06:29:37 2013 - [info]     Replicating from 192.168.8.120(192.168.8.120:3306)
Sat Aug 10 06:29:37 2013 - [info]     Not candidate for the new Master (no_master is set)
Sat Aug 10 06:29:37 2013 - [info] Current Alive Master: 192.168.8.120(192.168.8.120:3306)
Sat Aug 10 06:29:37 2013 - [info] Checking slave configurations..
Sat Aug 10 06:29:37 2013 - [info] Checking replication filtering settings..
Sat Aug 10 06:29:37 2013 - [info]  binlog_do_db= , binlog_ignore_db=
Sat Aug 10 06:29:37 2013 - [info]  Replication filtering check ok.
Sat Aug 10 06:29:37 2013 - [info] Starting SSH connection tests..
Sat Aug 10 06:29:40 2013 - [info] All SSH connection tests passed successfully.
Sat Aug 10 06:29:40 2013 - [info] Checking MHA Node version..
Sat Aug 10 06:29:41 2013 - [info]  Version check ok.
Sat Aug 10 06:29:41 2013 - [info] Checking SSH publickey authentication settings on the current master..
Sat Aug 10 06:29:42 2013 - [info] HealthCheck: SSH to 192.168.8.120 is reachable.
Sat Aug 10 06:29:42 2013 - [info] Master MHA Node version is 0.53.
Sat Aug 10 06:29:42 2013 - [info] Checking recovery script configurations on the current master..
Sat Aug 10 06:29:42 2013 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/app/mysql --output_file=/app/mha/save_binary_logs_test --manager_version=0.53 --start_file=mysql-bin.000010
Sat Aug 10 06:29:42 2013 - [info]   Connecting to root@192.168.8.120(192.168.8.120)..
  Creating /app/mha if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /app/mysql, up to mysql-bin.000010
Sat Aug 10 06:29:42 2013 - [info] Master setting check done.
Sat Aug 10 06:29:42 2013 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Sat Aug 10 06:29:42 2013 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user=root --slave_host=192.168.8.121 --slave_ip=192.168.8.121 --slave_port=3306 --workdir=/app/mha --target_version=5.5.29-log --manager_version=0.53 --relay_log_info=/app/mysql/relay-log.info --relay_dir=/app/mysql/ --slave_pass=xxx
Sat Aug 10 06:29:42 2013 - [info]   Connecting to root@192.168.8.121(192.168.8.121:22)..
  Checking slave recovery environment settings..
    Opening /app/mysql/relay-log.info ... ok.
    Relay log found at /app/mysql, up to mysql-relay-bin.000004
    Temporary relay log file is /app/mysql/mysql-relay-bin.000004
    Testing mysql connection and privileges.. done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Sat Aug 10 06:29:43 2013 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user=root --slave_host=192.168.8.122 --slave_ip=192.168.8.122 --slave_port=3306 --workdir=/app/mha --target_version=5.5.29-log --manager_version=0.53 --relay_log_info=/app/mysql/relay-log.info --relay_dir=/app/mysql/ --slave_pass=xxx
Sat Aug 10 06:29:43 2013 - [info]   Connecting to root@192.168.8.122(192.168.8.122:22)..
  Checking slave recovery environment settings..
    Opening /app/mysql/relay-log.info ... ok.
    Relay log found at /app/mysql, up to mysql-relay-bin.000004
    Temporary relay log file is /app/mysql/mysql-relay-bin.000004
    Testing mysql connection and privileges.. done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Sat Aug 10 06:29:43 2013 - [info] Slaves settings check done.
Sat Aug 10 06:29:43 2013 - [info]
192.168.8.120 (current master)
 +--192.168.8.121
 +--192.168.8.122
Sat Aug 10 06:29:43 2013 - [info] Checking master_ip_failover_script status:
Sat Aug 10 06:29:43 2013 - [info]   /scripts/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.8.120 --orig_master_ip=192.168.8.120 --orig_master_port=3306
Sat Aug 10 06:29:44 2013 - [info]  OK.
Sat Aug 10 06:29:44 2013 - [warning] shutdown_script is not defined.
Sat Aug 10 06:29:44 2013 - [info] Set master ping interval 1 seconds.
Sat Aug 10 06:29:44 2013 - [info] Set secondary check script: masterha_secondary_check -s 192.168.8.121 -s 192.168.8.122
Sat Aug 10 06:29:44 2013 - [info] Starting ping health check on 192.168.8.120(192.168.8.120:3306)..
Sat Aug 10 06:29:44 2013 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
[root@sh-gs-dbmg0227 /]# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 (pid:7127) is running(0:PING_OK), master:192.168.8.121
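
For monitoring scripts, the master address can be pulled out of the status line. This is a rough sketch that assumes the output format shown above; mha_status_master is a hypothetical helper and reads the status output on standard input.

```shell
# Hypothetical helper: read masterha_check_status output on stdin and print
# the address that follows "master:". Assumes the format shown above.
mha_status_master() {
    # Join lines first so a wrapped status line is still matched.
    tr '\n' ' ' | sed -n 's/.*master:[[:space:]]*\([^[:space:]]*\).*/\1/p'
}
```

Usage: masterha_check_status --conf=/etc/masterha/app1.cnf | mha_status_master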
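(10) On the candidate master and the slave, schedule a periodic purge of the relay logs
00 00 * * * /usr/local/bin/purge_relay_logs --user=root --password=123456 --disable_relay_log_purge >> /masterha/purge_relay_logs.log 2>&1
(11) Test: stop the MySQL service on the master to simulate a host failure, then check whether the master role fails over automatically.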
Sat Aug 10 06:31:27 2013 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)
Sat Aug 10 06:31:27 2013 - [info] Executing seconary network check script: masterha_secondary_check -s 192.168.8.121 -s 192.168.8.122 --user=root --master_host=192.168.8.120 --master_ip=192.168.8.120 --master_port=3306
Sat Aug 10 06:31:27 2013 - [info] Executing SSH check script: save_binary_logs --command=test --start_pos=4 --binlog_dir=/app/mysql --output_file=/app/mha/save_binary_logs_test --manager_version=0.53 --binlog_prefix=mysql-bin
Sat Aug 10 06:31:28 2013 - [info] HealthCheck: SSH to 192.168.8.120 is reachable.
Sat Aug 10 06:31:28 2013 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Sat Aug 10 06:31:28 2013 - [warning] Connection failed 1 time(s)..
Monitoring server 192.168.8.121 is reachable, Master is not reachable from 192.168.8.121. OK.
Monitoring server 192.168.8.122 is reachable, Master is not reachable from 192.168.8.122. OK.
Sat Aug 10 06:31:29 2013 - [info] Master is not reachable from all other monitoring servers. Failover should start.
Sat Aug 10 06:31:29 2013 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Sat Aug 10 06:31:29 2013 - [warning] Connection failed 2 time(s)..
Sat Aug 10 06:31:30 2013 - [warning] Got error on MySQL connect: 2013 (Lost connection to MySQL server at 'reading initial communication packet', system error: 111)
Sat Aug 10 06:31:30 2013 - [warning] Connection failed 3 time(s)..
Sat Aug 10 06:31:30 2013 - [warning] Master is not reachable from health checker!
Sat Aug 10 06:31:30 2013 - [warning] Master 192.168.8.120(192.168.8.120:3306) is not reachable!
Sat Aug 10 06:31:30 2013 - [warning] SSH is reachable.
Sat Aug 10 06:31:30 2013 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha/masterha_default.cnf and /etc/masterha/app1.cnf again, and trying to connect to all servers to check server status..
Sat Aug 10 06:31:30 2013 - [info] Reading default configuratoins from /etc/masterha/masterha_default.cnf..
Sat Aug 10 06:31:30 2013 - [info] Reading application default configurations from /etc/masterha/app1.cnf..
Sat Aug 10 06:31:30 2013 - [info] Reading server configurations from /etc/masterha/app1.cnf..
Sat Aug 10 06:31:30 2013 - [info] Dead Servers:
Sat Aug 10 06:31:30 2013 - [info]   192.168.8.120(192.168.8.120:3306)
Sat Aug 10 06:31:30 2013 - [info] Alive Servers:
Sat Aug 10 06:31:30 2013 - [info]   192.168.8.121(192.168.8.121:3306)
Sat Aug 10 06:31:30 2013 - [info]   192.168.8.122(192.168.8.122:3306)
Sat Aug 10 06:31:30 2013 - [info] Alive Slaves:
Sat Aug 10 06:31:30 2013 - [info]   192.168.8.121(192.168.8.121:3306)  Version=5.5.29-log (oldest major version between slaves) log-bin:enabled
Sat Aug 10 06:31:30 2013 - [info]     Replicating from 192.168.8.120(192.168.8.120:3306)
Sat Aug 10 06:31:30 2013 - [info]     Primary candidate for the new Master (candidate_master is set)
Sat Aug 10 06:31:30 2013 - [info]   192.168.8.122(192.168.8.122:3306)  Version=5.5.29-log (oldest major version between slaves) log-bin:enabled
Sat Aug 10 06:31:30 2013 - [info]     Replicating from 192.168.8.120(192.168.8.120:3306)
Sat Aug 10 06:31:30 2013 - [info]     Not candidate for the new Master (no_master is set)
Sat Aug 10 06:31:30 2013 - [info] Checking slave configurations..
Sat Aug 10 06:31:30 2013 - [info] Checking replication filtering settings..
Sat Aug 10 06:31:30 2013 - [info]  Replication filtering check ok.
Sat Aug 10 06:31:30 2013 - [info] Master is down!
Sat Aug 10 06:31:30 2013 - [info] Terminating monitoring script.
Sat Aug 10 06:31:30 2013 - [info] Got exit code 20 (Master dead).
Sat Aug 10 06:31:30 2013 - [info] MHA::MasterFailover version 0.53.
Sat Aug 10 06:31:30 2013 - [info] Starting master failover.
Sat Aug 10 06:31:30 2013 - [info]
Sat Aug 10 06:31:30 2013 - [info] * Phase 1: Configuration Check Phase..
Sat Aug 10 06:31:30 2013 - [info]
Sat Aug 10 06:31:30 2013 - [info] Dead Servers:
Sat Aug 10 06:31:30 2013 - [info]   192.168.8.120(192.168.8.120:3306)
Sat Aug 10 06:31:30 2013 - [info] Checking master reachability via mysql(double check)..
Sat Aug 10 06:31:30 2013 - [info]  ok.
Sat Aug 10 06:31:30 2013 - [info] Alive Servers:
Sat Aug 10 06:31:30 2013 - [info]   192.168.8.121(192.168.8.121:3306)
Sat Aug 10 06:31:30 2013 - [info]   192.168.8.122(192.168.8.122:3306)
Sat Aug 10 06:31:30 2013 - [info] Alive Slaves:
Sat Aug 10 06:31:30 2013 - [info]   192.168.8.121(192.168.8.121:3306)  Version=5.5.29-log (oldest major version between slaves) log-bin:enabled
Sat Aug 10 06:31:30 2013 - [info]     Replicating from 192.168.8.120(192.168.8.120:3306)
Sat Aug 10 06:31:30 2013 - [info]     Primary candidate for the new Master (candidate_master is set)
Sat Aug 10 06:31:30 2013 - [info]   192.168.8.122(192.168.8.122:3306)  Version=5.5.29-log (oldest major version between slaves) log-bin:enabled
Sat Aug 10 06:31:30 2013 - [info]     Replicating from 192.168.8.120(192.168.8.120:3306)
Sat Aug 10 06:31:30 2013 - [info]     Not candidate for the new Master (no_master is set)
Sat Aug 10 06:31:30 2013 - [info] ** Phase 1: Configuration Check Phase completed.
Sat Aug 10 06:31:30 2013 - [info]
Sat Aug 10 06:31:30 2013 - [info] * Phase 2: Dead Master Shutdown Phase..
Sat Aug 10 06:31:30 2013 - [info]
Sat Aug 10 06:31:30 2013 - [info] Forcing shutdown so that applications never connect to the current master..
Sat Aug 10 06:31:30 2013 - [info] Executing master IP deactivatation script:
Sat Aug 10 06:31:30 2013 - [info]   /scripts/master_ip_failover --orig_master_host=192.168.8.120 --orig_master_ip=192.168.8.120 --orig_master_port=3306 --command=stopssh --ssh_user=root
Sat Aug 10 06:31:31 2013 - [info]  done.
Sat Aug 10 06:31:31 2013 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Sat Aug 10 06:31:31 2013 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Sat Aug 10 06:31:31 2013 - [info]
Sat Aug 10 06:31:31 2013 - [info] * Phase 3: Master Recovery Phase..
Sat Aug 10 06:31:31 2013 - [info]
Sat Aug 10 06:31:31 2013 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Sat Aug 10 06:31:31 2013 - [info]
Sat Aug 10 06:31:31 2013 - [info] The latest binary log file/position on all slaves is mysql-bin.000010:107
Sat Aug 10 06:31:31 2013 - [info] Latest slaves (Slaves that received relay log files to the latest):
Sat Aug 10 06:31:31 2013 - [info]   192.168.8.121(192.168.8.121:3306)  Version=5.5.29-log (oldest major version between slaves) log-bin:enabled
Sat Aug 10 06:31:31 2013 - [info]     Replicating from 192.168.8.120(192.168.8.120:3306)
Sat Aug 10 06:31:31 2013 - [info]     Primary candidate for the new Master (candidate_master is set)
Sat Aug 10 06:31:31 2013 - [info]   192.168.8.122(192.168.8.122:3306)  Version=5.5.29-log (oldest major version between slaves) log-bin:enabled
Sat Aug 10 06:31:31 2013 - [info]     Replicating from 192.168.8.120(192.168.8.120:3306)
Sat Aug 10 06:31:31 2013 - [info]     Not candidate for the new Master (no_master is set)
Sat Aug 10 06:31:31 2013 - [info] The oldest binary log file/position on all slaves is mysql-bin.000010:107
Sat Aug 10 06:31:31 2013 - [info] Oldest slaves:
Sat Aug 10 06:31:31 2013 - [info]   192.168.8.121(192.168.8.121:3306)  Version=5.5.29-log (oldest major version between slaves) log-bin:enabled
Sat Aug 10 06:31:31 2013 - [info]     Replicating from 192.168.8.120(192.168.8.120:3306)
Sat Aug 10 06:31:31 2013 - [info]     Primary candidate for the new Master (candidate_master is set)
Sat Aug 10 06:31:31 2013 - [info]   192.168.8.122(192.168.8.122:3306)  Version=5.5.29-log (oldest major version between slaves) log-bin:enabled
Sat Aug 10 06:31:31 2013 - [info]     Replicating from 192.168.8.120(192.168.8.120:3306)
Sat Aug 10 06:31:31 2013 - [info]     Not candidate for the new Master (no_master is set)
Sat Aug 10 06:31:31 2013 - [info]
Sat Aug 10 06:31:31 2013 - [info] * Phase 3.2: Saving Dead Master's Binlog Phase..
Sat Aug 10 06:31:31 2013 - [info]
Sat Aug 10 06:31:31 2013 - [info] Fetching dead master's binary logs..
Sat Aug 10 06:31:31 2013 - [info] Executing command on the dead master 192.168.8.120(192.168.8.120:3306): save_binary_logs --command=save --start_file=mysql-bin.000010 --start_pos=107 --binlog_dir=/app/mysql --output_file=/app/mha/saved_master_binlog_from_192.168.8.120_3306_20130810063130.binlog --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.53
  Creating /app/mha if not exists..    ok.
 Concat binary/relay logs from mysql-bin.000010 pos 107 to mysql-bin.000010 EOF into /app/mha/saved_master_binlog_from_192.168.8.120_3306_20130810063130.binlog ..
  Dumping binlog format description event, from position 0 to 107.. ok.
  Dumping effective binlog data from /app/mysql/mysql-bin.000010 position 107 to tail(126).. ok.
 Concat succeeded.
Sat Aug 10 06:31:32 2013 - [info] scp from root@192.168.8.120:/app/mha/saved_master_binlog_from_192.168.8.120_3306_20130810063130.binlog to local:/app/mha/saved_master_binlog_from_192.168.8.120_3306_20130810063130.binlog succeeded.
Sat Aug 10 06:31:32 2013 - [info] HealthCheck: SSH to 192.168.8.121 is reachable.
Sat Aug 10 06:31:33 2013 - [info] HealthCheck: SSH to 192.168.8.122 is reachable.
Sat Aug 10 06:31:33 2013 - [info]
Sat Aug 10 06:31:33 2013 - [info] * Phase 3.3: Determining New Master Phase..
Sat Aug 10 06:31:33 2013 - [info]
Sat Aug 10 06:31:33 2013 - [info] Finding the latest slave that has all relay logs for recovering other slaves..
Sat Aug 10 06:31:33 2013 - [info] All slaves received relay logs to the same position. No need to resync each other.
Sat Aug 10 06:31:33 2013 - [info] Searching new master from slaves..
Sat Aug 10 06:31:33 2013 - [info]  Candidate masters from the configuration file:
Sat Aug 10 06:31:33 2013 - [info]   192.168.8.121(192.168.8.121:3306)  Version=5.5.29-log (oldest major version between slaves) log-bin:enabled
Sat Aug 10 06:31:33 2013 - [info]     Replicating from 192.168.8.120(192.168.8.120:3306)
Sat Aug 10 06:31:33 2013 - [info]     Primary candidate for the new Master (candidate_master is set)
Sat Aug 10 06:31:33 2013 - [info]  Non-candidate masters:
Sat Aug 10 06:31:33 2013 - [info]   192.168.8.122(192.168.8.122:3306)  Version=5.5.29-log (oldest major version between slaves) log-bin:enabled
Sat Aug 10 06:31:33 2013 - [info]     Replicating from 192.168.8.120(192.168.8.120:3306)
Sat Aug 10 06:31:33 2013 - [info]     Not candidate for the new Master (no_master is set)
Sat Aug 10 06:31:33 2013 - [info]  Searching from candidate_master slaves which have received the latest relay log events..
Sat Aug 10 06:31:33 2013 - [info] New master is 192.168.8.121(192.168.8.121:3306)
Sat Aug 10 06:31:33 2013 - [info] Starting master failover..
Sat Aug 10 06:31:33 2013 - [info]
From:
192.168.8.120 (current master)
 +--192.168.8.121
 +--192.168.8.122

To:
192.168.8.121 (new master)
 +--192.168.8.122
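
After a failover, the promoted host can be read back from the manager log by looking for the "New master is" entry shown above. The following is a minimal sketch: new_master_from_log is a hypothetical helper, and it assumes each log entry is on a single line, as in the manager's own log file (manager_log=/app/mha/manager.log in app1.cnf).

```shell
# Hypothetical helper: print the host named in the most recent
# "New master is host(ip:port)" entry of an MHA manager log file.
new_master_from_log() {
    # Capture everything between "New master is " and the first "(".
    sed -n 's/.*\[info\] New master is \([^(]*\)(.*/\1/p' "$1" | tail -n 1
}
```

Usage: new_master_from_log /app/mha/manager.log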