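DM8 (Dameng 8): Impact of Network Interruptions on the System

Test environment: three-node real-time primary/standby cluster

Version: --03134283938-20221019-172201-20018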

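Test 1

The confirm monitor was not started.

Disable the network interface on node 3.

Log in to node 1 and check the status of the primary database.

The status shows that archives were sent to node 2 successfully, but no messages could be received from node 3, and node 1 was suspended.

The log reports the following errors: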

2024-06-06 00:47:38.481 [INFO] database P0000002319 T0000000000000002373  Send archive log to remote instance failed, switch all ep to SUSPEND status success!
2024-06-06 00:47:48.482 [ERROR] database P0000002319 T0000000000000002356  Can't connect to DM server on '192.168.100.102' port(5800) errno(115)
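
When repeating this kind of test, it helps to pull the SUSPEND event out of the log automatically rather than reading it by eye. Below is a minimal sketch that parses lines in the layout seen in the excerpt above. The layout (timestamp, level, `database`, `P<pid>`, `T<tid>`, message) is inferred only from the lines quoted in this post, and the names `LogEntry`, `parse_line`, and `is_suspend_event` are hypothetical helpers, not part of any DM8 tooling.

```python
import re
from typing import NamedTuple, Optional

# Layout inferred from the excerpts in this post only (an assumption):
# "<date> <time> [LEVEL] database P<pid> T<tid>  <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<level>[A-Z]+)\] database "
    r"P(?P<pid>\d+) T(?P<tid>\d+)\s+(?P<msg>.*)$"
)


class LogEntry(NamedTuple):
    ts: str
    level: str
    pid: int
    tid: int
    msg: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Return a LogEntry, or None if the line does not match the layout."""
    m = LOG_LINE.match(line.strip())
    if m is None:
        return None
    return LogEntry(m["ts"], m["level"], int(m["pid"]), int(m["tid"]), m["msg"])


def is_suspend_event(entry: LogEntry) -> bool:
    """True for the message logged when the primary switches to SUSPEND."""
    return "switch all ep to SUSPEND status" in entry.msg


# The two lines quoted above, copied verbatim from the Test 1 log.
sample = [
    "2024-06-06 00:47:38.481 [INFO] database P0000002319 T0000000000000002373  "
    "Send archive log to remote instance failed, switch all ep to SUSPEND status success!",
    "2024-06-06 00:47:48.482 [ERROR] database P0000002319 T0000000000000002356  "
    "Can't connect to DM server on '192.168.100.102' port(5800) errno(115)",
]

entries = [e for e in map(parse_line, sample) if e is not None]
suspends = [e for e in entries if is_suspend_event(e)]
for e in suspends:
    print(e.ts, e.level, e.msg)
```

Matching on the message text rather than on the log level matters here: the SUSPEND switch is logged at INFO, so filtering for ERROR alone would miss it.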

Restore the network interface on node 3.

The primary database log shows the following:

2024-06-06 00:58:00.760 [INFO] database P0000002319 T0000000000000002356  mal_site_ctl_link_create startup from mal_site(0) to mal_site(2)!
2024-06-06 00:58:00.760 [INFO] database P0000002319 T0000000000000002356  mal_site_magic_gen site_magic[46500], src_site:0, dst_site:2
2024-06-06 00:58:00.761 [INFO] database P0000002319 T0000000000000002356  site[0] mal_site_ctl_port_set to site[2, IP: 192.168.100.102, port_num: 5800], socket handle = 12, site_magic = 46500
2024-06-06 00:58:00.761 [INFO] database P0000002319 T0000000000000002350  mal_site_port_get site_magic:46500, src_site:0, dst_site:2
2024-06-06 00:58:00.761 [INFO] database P0000002319 T0000000000000002349  mal_site_port_get site_magic:46500, src_site:0, dst_site:2
2024-06-06 00:58:00.768 [INFO] database P0000002319 T0000000000000002355  site[0] mal_site_data_port_set from site[2, IP: 192.168.100.102, port_num: 5800], socket handle = 14, site_magic = 46500
2024-06-06 00:58:00.769 [INFO] database P0000002319 T0000000000000002348  mal_site_port_get site_magic:46500, src_site:0, dst_site:2
2024-06-06 00:58:00.769 [INFO] database P0000002319 T0000000000000002351  mal_site_port_get site_magic:46500, src_site:0, dst_site:2
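
To confirm that the link rebuilt here goes to the same address that failed earlier (192.168.100.102, port 5800), the `port_set` entries can be extracted with a regular expression. This is a rough sketch: the pattern is derived only from the two `mal_site_ctl_port_set` / `mal_site_data_port_set` lines above, and reading `ctl` and `data` as the control and data connections is my assumption. `rebuilt_links` is a hypothetical helper name.

```python
import re

# Pattern derived only from the two *_port_set lines quoted in this post.
PORT_SET = re.compile(
    r"site\[(?P<src>\d+)\] mal_site_(?P<kind>ctl|data)_port_set (?:to|from) "
    r"site\[(?P<dst>\d+), IP: (?P<ip>[\d.]+), port_num: (?P<port>\d+)\]"
)


def rebuilt_links(lines):
    """Return (kind, src_site, dst_site, ip, port) for every port_set entry."""
    links = []
    for line in lines:
        m = PORT_SET.search(line)
        if m:
            links.append(
                (m["kind"], int(m["src"]), int(m["dst"]), m["ip"], int(m["port"]))
            )
    return links


# Three lines copied verbatim from the log above; the last one is not a
# port_set entry and is expected to be skipped.
sample = [
    "2024-06-06 00:58:00.761 [INFO] database P0000002319 T0000000000000002356  "
    "site[0] mal_site_ctl_port_set to site[2, IP: 192.168.100.102, port_num: 5800], "
    "socket handle = 12, site_magic = 46500",
    "2024-06-06 00:58:00.768 [INFO] database P0000002319 T0000000000000002355  "
    "site[0] mal_site_data_port_set from site[2, IP: 192.168.100.102, port_num: 5800], "
    "socket handle = 14, site_magic = 46500",
    "2024-06-06 00:58:00.761 [INFO] database P0000002319 T0000000000000002350  "
    "mal_site_port_get site_magic:46500, src_site:0, dst_site:2",
]

links = rebuilt_links(sample)
for link in links:
    print(link)
```

Both entries point at 192.168.100.102:5800, the address the primary could not reach in the earlier error, so the link to node 3 was re-established at the network level.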

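However, the primary database status was still SUSPEND.

After restarting the database (SHUTDOWN, after which the watcher automatically brought it back up), the status returned to normal.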

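Test 2

Start the confirm monitor on node 2.

Interrupt the network on node 3.

Log in to the primary database and check its status.

Although sending archives to TEST3 failed, the primary database status was normal.

The primary database log shows the following: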

2024-06-06 01:07:44.807 [ERROR] database P0000002774 T0000000000000002819  [mal recv for arch] mal receive from site(TEST3) failed, begin lsn:622386010, end lsn:622386010, code:-6021
2024-06-06 01:07:44.807 [ERROR] database P0000002774 T0000000000000002819  send realtime archive to instance[TEST3] failed, code = -6021, begin_lsn = 622386010, end_lsn = 622386010!
2024-06-06 01:07:44.811 [INFO] database P0000002774 T0000000000000002819  Send archive log to remote instance failed, switch all ep to SUSPEND status success!
2024-06-06 01:07:46.268 [INFO] database P0000002774 T0000000000000002872  utsk_cmd_add, cmd info: cmd=217, dseq=1717631069, name_in=, begin_lsn=-1!
2024-06-06 01:07:46.268 [INFO] database P0000002774 T0000000000000002872  utsk_set_global_dw_stat, begin, msg_dseq:1717631069
2024-06-06 01:07:46.268 [INFO] database P0000002774 T0000000000000002872  set g_dw_stat from NONE to DW_FAILOVER success, g_dw_recover_stop is 0
2024-06-06 01:07:46.268 [INFO] database P0000002774 T0000000000000002872  utsk_set_global_dw_stat, finished, msg_dseq:1717631069, set code:0
2024-06-06 01:07:47.269 [INFO] database P0000002774 T0000000000000002872  utsk_cmd_add, cmd info: cmd=214, dseq=1717631070, name_in=, begin_lsn=-1!
2024-06-06 01:07:47.269 [INFO] database P0000002774 T0000000000000002832  utsk_cmd_exec, cmd:214, sys_status:SUSPEND, dseq:1717631070
2024-06-06 01:07:47.270 [INFO] database P0000002774 T0000000000000002832  Change TEST3 arch status from VALID to INVALID
2024-06-06 01:07:47.270 [INFO] database P0000002774 T0000000000000002872  utsk_cmd_add, received sql exec cmd:1, dseq:1717631071, sql:ALTER DATABASE OPEN FORCE

The log shows that the primary was suspended and then almost immediately returned to OPEN status.
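
To put a number on "almost immediately", the gap between the SUSPEND entry and the `ALTER DATABASE OPEN FORCE` entry can be computed from their timestamps. This is a small sketch under two assumptions: the first 23 characters of each line are the timestamp (as in the excerpts in this post), and the moment the `OPEN FORCE` command was received approximates the moment the primary reopened, since the excerpt does not show a completion message. `first_timestamp` is a hypothetical helper.

```python
from datetime import datetime

# Timestamp layout as seen in the excerpts: "2024-06-06 01:07:44.811"
FMT = "%Y-%m-%d %H:%M:%S.%f"


def first_timestamp(lines, needle):
    """Return the timestamp of the first line containing `needle`, or None."""
    for line in lines:
        if needle in line:
            return datetime.strptime(line[:23], FMT)
    return None


# Three lines copied verbatim from the Test 2 log above.
sample = [
    "2024-06-06 01:07:44.811 [INFO] database P0000002774 T0000000000000002819  "
    "Send archive log to remote instance failed, switch all ep to SUSPEND status success!",
    "2024-06-06 01:07:47.270 [INFO] database P0000002774 T0000000000000002832  "
    "Change TEST3 arch status from VALID to INVALID",
    "2024-06-06 01:07:47.270 [INFO] database P0000002774 T0000000000000002872  "
    "utsk_cmd_add, received sql exec cmd:1, dseq:1717631071, sql:ALTER DATABASE OPEN FORCE",
]

suspended_at = first_timestamp(sample, "switch all ep to SUSPEND status")
reopen_cmd_at = first_timestamp(sample, "ALTER DATABASE OPEN FORCE")
gap = (reopen_cmd_at - suspended_at).total_seconds()
print(f"suspended for about {gap:.3f} s")
```

For the lines quoted here the gap is about 2.5 seconds, in contrast to Test 1, where the primary stayed in SUSPEND until the database was restarted.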

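Test 3

Start the confirm monitor on node 2.

Interrupt the network on node 2.

Log in to the primary database and check its status.

After the network was restored, node 2 had also become a primary, so the cluster was split (split-brain).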

Logging in to the monitor shows the following:

Once the cluster has split, the only option is to rebuild it.
