大家好,我是小五同学,难免有点时间,闲暇之余,给大家分享下,前段时间遇到的一个,删除了从库的两条数据,而后,发现删错了,又删除了主库相同的两条数据,而引起的主从同步异常的告警。谨以此篇,提醒各位同学,在对数据库做任何操作之前。,一定要看清楚,避免引起不必要的麻烦,“淹死的都是会游泳的”。废话不多说。
一、资源信息
IP |
状态 |
192.168.153.128 |
MASTER |
192.168.153.129 |
SLAVE |
二、在搭建好主从复制的基础上,在主库操作,新建测试表,并 插入测试数据
这里,主从同步,基于GTID的同步,半同步复制,高可用等文章不在描述,后续会一同同步至博客,大家关注后看即可。
(root@localhost) [(none)]> CREATE DATABASE `test`
(root@localhost) [test]> use test;
Database changed
CREATE TABLE `a` (
`a` int(11) NOT NULL,
`b` int(11) DEFAULT NULL,
PRIMARY KEY (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
INSERT INTO `a` (`a`, `b`) VALUES (1, 2);
INSERT INTO `a` (`a`, `b`) VALUES (2, 3);
INSERT INTO `a` (`a`, `b`) VALUES (3, 4);
INSERT INTO `a` (`a`, `b`) VALUES (4, 5);
CREATE TABLE `test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`data` json DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
INSERT INTO `test` (`id`, `data`) VALUES (1, '{\"age\": 30, \"name\": \"张三\", \"address\": {\"city\": \"上海\", \"street\": \"朝阳区\"}}');
INSERT INTO `test` (`id`, `data`) VALUES (2, '{\"hobbies\": [\"basketball\", \"football\", \"reading\"]}');
(root@localhost) [test]> select * from a;
+---+------+
| a | b |
+---+------+
| 1 | 2 |
| 2 | 3 |
| 3 | 4 |
| 4 | 5 |
+---+------+
(root@localhost) [test]> select * from test;
+----+-------------------------------------------------------------------------------------+
| id | data |
+----+-------------------------------------------------------------------------------------+
| 1 | {"age": 30, "name": "张三", "address": {"city": "上海", "street": "朝阳区"}} |
| 2 | {"hobbies": ["basketball", "football", "reading"]} |
+----+-------------------------------------------------------------------------------------+
---- 查看从库数据是否同步成功
(root@localhost) [test]> select * from a;
+---+------+
| a | b |
+---+------+
| 1 | 2 |
| 2 | 3 |
| 3 | 4 |
| 4 | 5 |
+---+------+
(root@localhost) [test]> select * from test;
+----+-------------------------------------------------------------------------------------+
| id | data |
+----+-------------------------------------------------------------------------------------+
| 1 | {"age": 30, "name": "张三", "address": {"city": "上海", "street": "朝阳区"}} |
| 2 | {"hobbies": ["basketball", "football", "reading"]} |
+----+-------------------------------------------------------------------------------------+
三、在从库删除表a的一条数据和表test的一条数据
(root@localhost) [test]> DELETE from a where a='2';
Query OK, 1 row affected (0.01 sec)
(root@localhost) [test]> DELETE from test where id='2';
Query OK, 1 row affected (0.00 sec)
四、去主库删除同样的数据
(root@localhost) [test]> DELETE from a where a='2';
Query OK, 1 row affected (0.01 sec)
(root@localhost) [test]> DELETE from test where id='2';
Query OK, 1 row affected (0.00 sec)
五、查看主从复制状态
(root@localhost) [test]> show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.153.128
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000163
Read_Master_Log_Pos: 1367
Relay_Log_File: mysql-relay-bin.000027
Relay_Log_Pos: 923
Relay_Master_Log_File: mysql-bin.000163
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1032
Last_Error: Could not execute Delete_rows event on table test.a; Can't find record in 'a', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000163, end_log_pos 995
Skip_Counter: 0
Exec_Master_Log_Pos: 749
Relay_Log_Space: 1963
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1032
Last_SQL_Error: Could not execute Delete_rows event on table test.a; Can't find record in 'a', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000163, end_log_pos 995
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_UUID: d26ec757-adba-11ed-997d-000c297ca098
Master_Info_File: mysql.slave_master_info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State:
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp: 240125 15:18:24
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
Master_public_key_path:
Get_master_public_key: 0
Network_Namespace:
1 row in set (0.00 sec)
发现 Slave_SQL_Running IO线程状态为 NO 报错:Could not execute Delete_rows event on table test.a; Can't find record in 'a', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000163, end_log_pos 995
意思就是 无法对表a 进行删除操作,因为这个数据不存在,是的 我们刚才删了。如果存在那就见鬼了
在从库根据复制状态查看具体错误信息,也可以从错误日志查看
(root@localhost) [test]> select * from performance_schema.replication_applier_status_by_worker\G
*************************** 1. row ***************************
CHANNEL_NAME:
WORKER_ID: 0
THREAD_ID: NULL
SERVICE_STATE: OFF
LAST_ERROR_NUMBER: 1032
LAST_ERROR_MESSAGE: Could not execute Delete_rows event on table test.a; Can't find record in 'a', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000163, end_log_pos 995
LAST_ERROR_TIMESTAMP: 2024-01-25 15:18:24.742522
LAST_APPLIED_TRANSACTION: ANONYMOUS
LAST_APPLIED_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 2024-01-25 15:11:29.090045
LAST_APPLIED_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 2024-01-25 15:11:29.090045
LAST_APPLIED_TRANSACTION_START_APPLY_TIMESTAMP: 2024-01-25 15:11:29.813081
LAST_APPLIED_TRANSACTION_END_APPLY_TIMESTAMP: 2024-01-25 15:11:29.815126
APPLYING_TRANSACTION: ANONYMOUS
APPLYING_TRANSACTION_ORIGINAL_COMMIT_TIMESTAMP: 2024-01-25 15:18:24.018427
APPLYING_TRANSACTION_IMMEDIATE_COMMIT_TIMESTAMP: 2024-01-25 15:18:24.018427
APPLYING_TRANSACTION_START_APPLY_TIMESTAMP: 2024-01-25 15:18:24.741955
LAST_APPLIED_TRANSACTION_RETRIES_COUNT: 0
LAST_APPLIED_TRANSACTION_LAST_TRANSIENT_ERROR_NUMBER: 0
LAST_APPLIED_TRANSACTION_LAST_TRANSIENT_ERROR_MESSAGE:
LAST_APPLIED_TRANSACTION_LAST_TRANSIENT_ERROR_TIMESTAMP: 0000-00-00 00:00:00.000000
APPLYING_TRANSACTION_RETRIES_COUNT: 0
APPLYING_TRANSACTION_LAST_TRANSIENT_ERROR_NUMBER: 0
APPLYING_TRANSACTION_LAST_TRANSIENT_ERROR_MESSAGE:
APPLYING_TRANSACTION_LAST_TRANSIENT_ERROR_TIMESTAMP: 0000-00-00 00:00:00.000000
1 row in set (0.01 sec)
此时 需要怎么恢复主从的同步状态呢?两种方法。
1、重新 配置主从 change master ...
2、因为是在主库删除的数据,在已知情况下 可以跳过主库执行的这个事务。来恢复主从,
六、主从同步异常情况下,主库有数据写入
INSERT INTO `test`.`a` (`a`, `b`) VALUES (5, 11)
INSERT INTO `test`.`a` (`a`, `b`) VALUES (6, 22)
INSERT INTO `test`.`a` (`a`, `b`) VALUES (7, 33)
INSERT INTO `test`.`a` (`a`, `b`) VALUES (8, 44)
七、查看binlog日志
-- 主库执行
(root@localhost) [test]> show variables like 'log_bin';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| log_bin | ON |
+---------------+-------+
1 row in set (0.01 sec)
(root@localhost) [test]> show variables like 'log_bin%';
+---------------------------------+--------------------------------+
| Variable_name | Value |
+---------------------------------+--------------------------------+
| log_bin | ON |
| log_bin_basename | /var/lib/mysql/mysql-bin |
| log_bin_index | /var/lib/mysql/mysql-bin.index |
| log_bin_trust_function_creators | OFF |
| log_bin_use_v1_row_events | OFF |
+---------------------------------+--------------------------------+
5 rows in set (0.01 sec)
(root@localhost) [test]> show binlog events in 'mysql-bin.000163';
+------------------+------+----------------+-----------+-------------+-------------------------------------------+
| Log_name | Pos | Event_type | Server_id | End_log_pos | Info |
+------------------+------+----------------+-----------+-------------+-------------------------------------------+
| mysql-bin.000163 | 4 | Format_desc | 1 | 124 | Server ver: 8.0.16, Binlog ver: 4 |
| mysql-bin.000163 | 124 | Previous_gtids | 1 | 195 | d26ec757-adba-11ed-997d-000c297ca098:1-19 |
| mysql-bin.000163 | 195 | Anonymous_Gtid | 1 | 274 | SET @@SESSION.GTID_NEXT= 'ANONYMOUS' |
| mysql-bin.000163 | 274 | Query | 1 | 349 | BEGIN |
| mysql-bin.000163 | 349 | Table_map | 1 | 397 | table_id: 86 (test.a) |
| mysql-bin.000163 | 397 | Write_rows | 1 | 441 | table_id: 86 flags: STMT_END_F |
| mysql-bin.000163 | 441 | Xid | 1 | 472 | COMMIT /* xid=32 */ |
| mysql-bin.000163 | 472 | Anonymous_Gtid | 1 | 551 | SET @@SESSION.GTID_NEXT= 'ANONYMOUS' |
| mysql-bin.000163 | 551 | Query | 1 | 626 | BEGIN |
| mysql-bin.000163 | 626 | Table_map | 1 | 674 | table_id: 86 (test.a) |
| mysql-bin.000163 | 674 | Write_rows | 1 | 718 | table_id: 86 flags: STMT_END_F |
| mysql-bin.000163 | 718 | Xid | 1 | 749 | COMMIT /* xid=33 */ |
| mysql-bin.000163 | 749 | Anonymous_Gtid | 1 | 828 | SET @@SESSION.GTID_NEXT= 'ANONYMOUS' |
| mysql-bin.000163 | 828 | Query | 1 | 903 | BEGIN |
| mysql-bin.000163 | 903 | Table_map | 1 | 951 | table_id: 86 (test.a) |
| mysql-bin.000163 | 951 | Delete_rows | 1 | 995 | table_id: 86 flags: STMT_END_F |
| mysql-bin.000163 | 995 | Xid | 1 | 1026 | COMMIT /* xid=34 */ |
| mysql-bin.000163 | 1026 | Anonymous_Gtid | 1 | 1105 | SET @@SESSION.GTID_NEXT= 'ANONYMOUS' |
| mysql-bin.000163 | 1105 | Query | 1 | 1180 | BEGIN |
| mysql-bin.000163 | 1180 | Table_map | 1 | 1232 | table_id: 96 (test.test) |
| mysql-bin.000163 | 1232 | Delete_rows | 1 | 1336 | table_id: 96 flags: STMT_END_F |
| mysql-bin.000163 | 1336 | Xid | 1 | 1367 | COMMIT /* xid=35 */ |
| mysql-bin.000163 | 1367 | Anonymous_Gtid | 1 | 1446 | SET @@SESSION.GTID_NEXT= 'ANONYMOUS' |
| mysql-bin.000163 | 1446 | Query | 1 | 1521 | BEGIN |
| mysql-bin.000163 | 1521 | Table_map | 1 | 1569 | table_id: 86 (test.a) |
| mysql-bin.000163 | 1569 | Write_rows | 1 | 1613 | table_id: 86 flags: STMT_END_F |
| mysql-bin.000163 | 1613 | Xid | 1 | 1644 | COMMIT /* xid=36 */ |
| mysql-bin.000163 | 1644 | Anonymous_Gtid | 1 | 1723 | SET @@SESSION.GTID_NEXT= 'ANONYMOUS' |
| mysql-bin.000163 | 1723 | Query | 1 | 1798 | BEGIN |
| mysql-bin.000163 | 1798 | Table_map | 1 | 1846 | table_id: 86 (test.a) |
| mysql-bin.000163 | 1846 | Write_rows | 1 | 1890 | table_id: 86 flags: STMT_END_F |
| mysql-bin.000163 | 1890 | Xid | 1 | 1921 | COMMIT /* xid=37 */ |
| mysql-bin.000163 | 1921 | Anonymous_Gtid | 1 | 2000 | SET @@SESSION.GTID_NEXT= 'ANONYMOUS' |
| mysql-bin.000163 | 2000 | Query | 1 | 2075 | BEGIN |
| mysql-bin.000163 | 2075 | Table_map | 1 | 2123 | table_id: 86 (test.a) |
| mysql-bin.000163 | 2123 | Write_rows | 1 | 2167 | table_id: 86 flags: STMT_END_F |
| mysql-bin.000163 | 2167 | Xid | 1 | 2198 | COMMIT /* xid=38 */ |
| mysql-bin.000163 | 2198 | Anonymous_Gtid | 1 | 2277 | SET @@SESSION.GTID_NEXT= 'ANONYMOUS' |
| mysql-bin.000163 | 2277 | Query | 1 | 2352 | BEGIN |
| mysql-bin.000163 | 2352 | Table_map | 1 | 2400 | table_id: 86 (test.a) |
| mysql-bin.000163 | 2400 | Write_rows | 1 | 2444 | table_id: 86 flags: STMT_END_F |
| mysql-bin.000163 | 2444 | Xid | 1 | 2475 | COMMIT /* xid=39 */ |
+------------------+------+----------------+-----------+-------------+-------------------------------------------+
42 rows in set (0.00 sec)
分析binlog
[root@localhost mysql]# mysqlbinlog --no-defaults --base64-output=decode-rows -vvv --start-position=828 --stop-position=1446 mysql-bin.000163
可以在主库看到删除的两条数据 时间是:240125 15:18:24
八、查看从库binlog
(root@localhost) [test]> show binary logs;
+------------------+-----------+-----------+
| Log_name | File_size | Encrypted |
+------------------+-----------+-----------+
| mysql-bin.000023 | 178 | No |
| mysql-bin.000024 | 5706 | No |
| mysql-bin.000025 | 7946 | No |
| mysql-bin.000026 | 1331 | No |
+------------------+-----------+-----------+
(root@localhost) [test]> show binlog events in 'mysql-bin.000026' from 792;
+------------------+------+----------------+-----------+-------------+--------------------------------------+
| Log_name | Pos | Event_type | Server_id | End_log_pos | Info |
+------------------+------+----------------+-----------+-------------+--------------------------------------+
| mysql-bin.000026 | 792 | Query | 2 | 867 | BEGIN |
| mysql-bin.000026 | 867 | Table_map | 2 | 915 | table_id: 84 (test.a) |
| mysql-bin.000026 | 915 | Delete_rows | 2 | 959 | table_id: 84 flags: STMT_END_F |
| mysql-bin.000026 | 959 | Xid | 2 | 990 | COMMIT /* xid=26 */ |
| mysql-bin.000026 | 990 | Anonymous_Gtid | 2 | 1069 | SET @@SESSION.GTID_NEXT= 'ANONYMOUS' |
| mysql-bin.000026 | 1069 | Query | 2 | 1144 | BEGIN |
| mysql-bin.000026 | 1144 | Table_map | 2 | 1196 | table_id: 94 (test.test) |
| mysql-bin.000026 | 1196 | Delete_rows | 2 | 1300 | table_id: 94 flags: STMT_END_F |
| mysql-bin.000026 | 1300 | Xid | 2 | 1331 | COMMIT /* xid=27 */ |
+------------------+------+----------------+-----------+-------------+--------------------------------------+
分析
[root@localhost mysql]# mysqlbinlog --no-defaults --base64-output=decode-rows -vvv --start-position=792 --stop-position=1331 mysql-bin.000026
可以在从库看到删除的两条数据 时间是:240125 15:18:11
时间上 比主库操作提前几秒 240125 15:18:24(主库时间)
由此可以断定,是先删除的从库,在删除主库,而引起的主从同步异常
既然都需要删除这两条数据,也都已经执行了,可以在从库直接跳过这两个事务即可
可以先尝试跳过一个事务
(root@localhost) [test]> set global SQL_SLAVE_SKIP_COUNTER=1;
Query OK, 0 rows affected (0.00 sec)
(root@localhost) [test]>
(root@localhost) [test]> start slave sql_thread;
Query OK, 0 rows affected (0.01 sec)
在查询主从的状态
(root@localhost) [test]> show slave status \G
Last_Error: Could not execute Delete_rows event on table test.test; Can't find record in 'test', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's master log mysql-bin.000163, end_log_pos 1336
此时的报错信息 指向的是找不到test 表的这个数据,与跳过事务之前报错找不到表a 一致
再次跳过事务
(root@localhost) [test]> set global SQL_SLAVE_SKIP_COUNTER=1;
Query OK, 0 rows affected (0.00 sec)
(root@localhost) [test]>
(root@localhost) [test]>
(root@localhost) [test]> start slave sql_thread;
Query OK, 0 rows affected (0.00 sec)
(root@localhost) [test]> show slave status \G
Relay_Master_Log_File: mysql-bin.000163
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
主从同步正常
如果查看状态还有异常的话,在根据异常情况处理。