Friday, November 9, 2018

MySQL Replication Data Recovery using 'mysqlbinlog' - Part II

MySQL Replication Data Recovery using 'mysqlbinlog' - Part II

The previous post (PART-I)
http://mysqlhk.blogspot.com/2018/10/mysql-replication-recovery-from-binlog.html

It describes the Replication Recovery from binlog by using those binlog files to be treated as Relay Log.  The Relay Log mechanism when the server is startup, the recovery is the SQL_THREAD applier to apply data to the database.    Check on the PART-I post for details.

Part II is about using the MySQL utility "mysqlbinlog" to dump the content from binlog files and apply the SQL to the Database.

Documentation
https://dev.mysql.com/doc/refman/8.0/en/mysqlbinlog.html
https://dev.mysql.com/doc/refman/8.0/en/point-in-time-recovery.html

The following sections describe the tutorial for Replication Data Recovery using 'mysqlbinlog'. 

The tutorial includes the followings topics (Note: Using GTID replication)
1. Initialize MySQL Instances (3316, 3326) installation & Configure Replication (Master and Slave)
2. Create Database and Table (test1.mytable1) with 8 rows
3. Stop the IO_THREAD on the Slave Server to simulate the situation where Data cannot transport to the RELAY Log on Slave Server.  Creating more data on Master and using "FLUSH LOGS;" to switching log files on Master server.
4. Checking the Replication Status on Slave Server for the Last retrieved position and file on Master.
5. Using 'mysqlbinlog' utility to extract SQL from the Master's binlog files starting from the position and file registered on Slave status.
6. Apply the SQLs to the Slave Server to recover the data and Check the Slave Status

7. Resume Replication "START SLAVE IO_THREAD FOR CHANNEL '<channel name>';"

Installation & Configurationo Assumptions
1. basedir = /usr/local/mysql   : The MySQL Binary Package
2. datadir = /home/mysql/data/server1 : The Server 1 Data Directory
3. datadir = /home/mysql/data/server2 : The Server 2 Data Directory
4. Using GTID and Channel for the replication setup
5. OS user for executing the commands - OS user : 'mysql'  (Do not use root
6. Configuration Files   (/home/mysql/data/my1.cnf & /home/mysql/data/my2.cnf)

/home/mysql/data/my1.cnf
[mysqld]
server-id=1
datadir=/home/mysql/data/server1
basedir=/usr/local/mysql

port=3316
socket=/home/mysql/data/server1/mysqld.sock

mysqlx-port=33160
mysqlx-socket=/home/mysql/data/server1/mysqlx.sock

log-error=/home/mysql/data/server1/mysqld.error
log-bin=mysqllog
relay-log=relay.bin

gtid-mode=on
enforce-gtid-consistency
master_info_repository=TABLE
relay_log_info_repository=TABLE


/home/mysql/data/my2.cnf
[mysqld]
server-id=2
datadir=/home/mysql/data/server2
basedir=/usr/local/mysql

port=3326
socket=/home/mysql/data/server2/mysqld.sock

mysqlx-port=33260
mysqlx-socket=/home/mysql/data/server2/mysqlx.sock

log-error=/home/mysql/data/server2/mysqld.error
log-bin=mysqllog
relay-log=relay.bin

gtid-mode=on
enforce-gtid-consistency
master_info_repository=TABLE
relay_log_info_repository=TABLE


STEP 1 : Initialize MySQL Instances and Configure Replication

Initialize Data Folder for 2 instances

# /usr/local/mysql/bin/mysqld --defaults-file=/home/mysql/data/my1.cnf --initialize-insecure
# /usr/local/mysql/bin/mysqld --defaults-file=/home/mysql/data/my2.cnf --initialize-insecure

Start up MySQL Instances
# /usr/local/mysql/bin/mysqld --defaults-file=/home/mysql/data/my1.cnf &
# /usr/local/mysql/bin/mysqld --defaults-file=/home/mysql/data/my2.cnf &

 Configure Replication User on Master and Slave
# /usr/local/mysql/bin/mysql -uroot -h127.0.0.1 -P3316 -e "
drop user if exists repl@'localhost';
create user repl@'localhost' identified with mysql_native_password by 'repl';
grant replication slave on *.* to repl@'localhost';
"

# /usr/local/mysql/bin/mysql -uroot -h127.0.0.1 -P3326 -e "
drop user if exists repl@'localhost';
create user repl@'localhost' identified with mysql_native_password by 'repl';
grant replication slave on *.* to repl@'localhost';
"

# /usr/local/mysql/bin/mysql -uroot -h127.0.0.1 -P3316 << EOL3316
reset master;
reset slave;
EOL3316

# /usr/local/mysql/bin/mysql -uroot -h127.0.0.1 -P3326 << EOL3326
reset master;
reset slave;
EOL3326

# /usr/local/mysql/bin/mysql -uroot -h127.0.0.1 -P3326 << EOL1

change master to
master_host='127.0.0.1',
master_user='repl',
master_password='repl',
master_port=3316,
master_auto_position=1
for channel 'channel1';

start slave for channel 'channel1';

show slave status for channel 'channel1'\G

EOL1

 STEP 2 : Create Database and Table (test1.mytable1) with 8 rows of data

# /usr/local/mysql/bin/mysql -uroot -h127.0.0.1 -P3316

mysql> create database if not exists test1;
mysql> create table if not exists test1.mytable1 (f1 int not null auto_increment primary key, f2 varchar(20));

mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');



The data should now be replicated to Slave.  To show the data on Slave and Check the Replication Status, Execute :

# /usr/local/mysql/bin/mysql -uroot -h127.0.0.1 -P3326 -e "select count(*) from test1.mytable1;"

 /usr/local/mysql/bin/mysql -uroot -h127.0.0.1 -P3326 -e "show slave status for channel 'channel1'\G"

STEP 3 : Stop the IO_THREAD on Slave Server

This is to simulate the connection between master and slave being disconnected.  So there is no more NEW update to RELAY LOG on Slave Server.   To do this :

# /usr/local/mysql/bin/mysql -uroot -h127.0.0.1 -P3326 -e "stop slave IO_THREAD for channel 'channel1';"

Creating 16 more rows on Master Server, however they are not replicated to Slave Server.

# /usr/local/mysql/bin/mysql -uroot -h127.0.0.1 -P3316

mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');

At this point, Master Server has 32 rows.  However the Slave Server has ONLY 16 rows.

Executing "FLUSH LOGS" to flush the binlog and switch to new logs.  Inserting 16 more rows.
# /usr/local/mysql/bin/mysql -uroot -h127.0.0.1 -P3316

mysql> FLUSH LOGS;
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');

mysql> FLUSH LOGS;
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');

After executing 2 times of "FLUSH LOGS;" commands, there are 3 binlog files created in the datadir "/home/mysql/data/server1" of the Master Server.   They are
mysqllog.000001
mysqllog.000002
mysqllog.000003
mysqllog.index

Number of ROWS on MASTER : 48
Number of ROWS on SLAVE : 16

***************************************************************************************************************
***  To recover the DATA from BINLOG of the Master Server to Slave Server   
***************************************************************************************************************

STEP 4 : Check the status on Slave  Server for what the last position of the BINLOG is and which file is the LAST file on Master Server being used.

Assuming we have the binlog files from MASTER Files - from datadir "/home/mysql/data/server1"
-- mysqllog.000001
-- mysqllog.000002
-- mysqllog.000003
-- mysqllog.index

# /usr/local/mysql/bin/mysql -uroot -h127.0.0.1 -P3326 -e "show slave status\G"
E.g. The output from command may look like this

 *************************** 1. row ***************************
               Slave_IO_State:
                  Master_Host: 127.0.0.1
                  Master_User: repl
                  Master_Port: 3316
                Connect_Retry: 60
              Master_Log_File: mysqllog.000001
          Read_Master_Log_Pos: 5422
 
              Relay_Log_File: relay-channel1.000002
                Relay_Log_Pos: 5634
        Relay_Master_Log_File: mysqllog.000001
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
...

...
       Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set: 98424208-e3f5-11e8-bf4f-0800271b198a:1-18
            Executed_Gtid_Set: 98424208-e3f5-11e8-bf4f-0800271b198a:1-18
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name: channel1
           Master_TLS_Version:
       Master_public_key_path:
        Get_master_public_key: 0

 **************************************************
The Master Log File : mysqllog.000001
and Master Log Pos : 5422

This is the last position that the Slave IO_THREAD has read from Master Binlog file.

STEP 5 : Use 'mysqlbinlog' utility to extract SQL from MASTER Binlog files
Note : There are 3 binlog files from datadir "/home/mysql/data/server1"
-- mysqllog.000001
-- mysqllog.000002
-- mysqllog.000003
-- mysqllog.index

The command should include ALL binlog files starting from the one indicated by the Slave Status "Master Log Fie"

# cd /home/mysql/data/server1/
# /usr/local/mysql/bin/mysqlbinlog --start-position=5422 mysqllog.000001 mysqllog.000002 mysqllog.000003 > /tmp/my.sql

The output '/tmp/my.sql' contains ALL SQL statements with data after the transaction from position = 5422.   The FIRST mysqllog.000001 in the command is important which is the Master Log File from the Slave Status.

STEP 6 : Apply the SQL to the Slave Server and Check the Slave Status

# /usr/local/mysql/bin/mysql -uroot -h127.0.0.1 -P3326 < /tmp/my.sql

# /usr/local/mysql/bin/mysql -uroot -h127.0.0.1 -P3326 -e "show slave status\G"
E.g. The output from command may look like this

*************************** 1. row ***************************
               Slave_IO_State:
                  Master_Host: 127.0.0.1
                  Master_User: repl
                  Master_Port: 3316
                Connect_Retry: 60
              Master_Log_File: mysqllog.000001
          Read_Master_Log_Pos: 5422

               Relay_Log_File: relay-channel1.000002
                Relay_Log_Pos: 5634
        Relay_Master_Log_File: mysqllog.000001
             Slave_IO_Running: No
            Slave_SQL_Running: Yes
...
...
           Retrieved_Gtid_Set: 98424208-e3f5-11e8-bf4f-0800271b198a:1-18
            Executed_Gtid_Set: 98424208-e3f5-11e8-bf4f-0800271b198a:1-54

                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name: channel1
           Master_TLS_Version:
       Master_public_key_path:
        Get_master_public_key: 0

************************************************************
Comparing the output from the previous Slave Status, the Master Log File and Post do not change.  But the Retrieved GTID_Set and Executed_Gtid_set indicated the execution has already applied the extra rows on the Master.

BEFORE data recovery from BINLOG
           Retrieved_Gtid_Set: 98424208-e3f5-11e8-bf4f-0800271b198a:1-18
            Executed_Gtid_Set: 98424208-e3f5-11e8-bf4f-0800271b198a:1-18

AFTER data recovery from BINLOG 
           Retrieved_Gtid_Set: 98424208-e3f5-11e8-bf4f-0800271b198a:1-18
            Executed_Gtid_Set: 98424208-e3f5-11e8-bf4f-0800271b198a:1-54


The outcome is from recovery is to get more transactions applied to slave.




STEP 7 :  Resume Replication

# /usr/local/mysql/bin/mysql -uroot -h127.0.0.1 -P3326 -e "START SLAVE IO_THREAD FOR CHANNEL 'channel1';"

Creating extra 16 rows on MASTER and check data on SLAVE
# /usr/local/mysql/bin/mysql -uroot -h127.0.0.1 -P3316

mysql> FLUSH LOGS;
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');
mysql>insert into test1.mytable1 (f2) values ('aaaaaaaaaaaaaaa');


Check the Slave Status
# /usr/local/mysql/bin/mysql -uroot -h127.0.0.1 -P3326 -e "show slave status\G"
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: repl
                  Master_Port: 3316
                Connect_Retry: 60
              Master_Log_File: mysqllog.000003
          Read_Master_Log_Pos: 10689
               Relay_Log_File: relay-channel1.000003
                Relay_Log_Pos: 5700
        Relay_Master_Log_File: mysqllog.000003
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
...
...
           Retrieved_Gtid_Set: 98424208-e3f5-11e8-bf4f-0800271b198a:1-18:55-72
            Executed_Gtid_Set: 98424208-e3f5-11e8-bf4f-0800271b198a:1-72
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name: channel1
           Master_TLS_Version:
       Master_public_key_path:
        Get_master_public_key: 0
*********************************************************************
The Replication Channel has already been resumed.  The IO and SQL Threads are up and running :
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

The Master_log_file and Read_Master_log_Pos has already been started with new number and file name:
               Master_Log_File: mysqllog.000003
          Read_Master_Log_Pos: 10689

The Retrieved_Gtid_Set and Executed_Gtid_Set on Slave Server has already been updated with latest information :
           Retrieved_Gtid_Set: 98424208-e3f5-11e8-bf4f-0800271b198a:1-18:55-72
            Executed_Gtid_Set: 98424208-e3f5-11e8-bf4f-0800271b198a:1-72


The Master and Slave Servers are UP and RUNNING.