Scaling out pump/drainer on an existing TiDB cluster to write binlog files to disk

菩提老鹰

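Using the tiup tool

tiup is currently the recommended way to manage and maintain TiDB clusters. The tiup commands for cluster operations are summarized below.

List all current TiDB clusters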

tiup cluster list

View the status of a specific cluster

tiup cluster display neibu-tidb

View the configuration of a specific cluster

tiup cluster show-config neibu-tidb

Edit the configuration of a specific cluster

tiup cluster edit-config neibu-tidb

After editing the cluster configuration, a reload is required

tiup cluster reload neibu-tidb

Scale out the cluster

tiup cluster scale-out neibu-tidb scale-out-binlog.yaml -uroot -p

binlog architecture

tidb-binlog-cluster-architecture

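For a detailed description of the architecture, see: get-started-with-tidb-binlog

Scaling out the binlog components

The goal of this scale-out is to write TiDB's changelogs to disk as binlog files rather than replicate them to a downstream MySQL or Kafka, so no downstream environment needs to be prepared in advance.

1. Set up passwordless login for the tidb user

tiup is deployed under the tidb user on a separate host, so before deploying, make sure passwordless SSH login works between the tiup host and every host where components will be deployed.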

An Ansible playbook already exists for this; just update the hosts in it and run the playbook.

ansible-playbook /etc/ansible/mission/tidb-user.yml
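
Where no such playbook is available, the same result can be reached by hand with ssh-keygen and ssh-copy-id. The following is a minimal sketch, not the author's playbook: it only prints the commands so they can be reviewed first. The key path is an assumption, the hosts are the pump and drainer hosts used later in this article, and it assumes the tidb user already exists on those hosts and accepts password login once.

```shell
#!/usr/bin/env bash
# Sketch only: prints the commands instead of running them.
# HOSTS are the pump/drainer hosts from this article; KEY is an assumed key path.
HOSTS="192.168.3.106 192.168.3.107"
KEY="$HOME/.ssh/id_ed25519"

ssh_setup_commands() {
  # Generate a key pair for the tidb user on the tiup host (skip if one already exists).
  echo "ssh-keygen -t ed25519 -f ${KEY} -N ''"
  for host in ${HOSTS}; do
    # Install the public key on the target host.
    echo "ssh-copy-id -i ${KEY}.pub tidb@${host}"
    # Confirm that login now works without a password prompt.
    echo "ssh -o BatchMode=yes tidb@${host} true"
  done
}

ssh_setup_commands
```

Run the printed commands as the tidb user on the tiup host; the final ssh check fails instead of prompting if key-based login is not working.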

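2. Create the scale-out file scale-out-binlog.yaml

Keep the layout of the deploy/data/log directories consistent with the rest of the cluster; run tiup cluster show-config neibu-tidb to see how the other components are configured.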

pump_servers:
 - host: 192.168.3.106
   ssh_port: 22
   port: 8250
   deploy_dir: /opt/app/tidb-deploy/pump-8250
   data_dir: /data/tidb-data/pump-8250
   log_dir: /opt/app/tidb-deploy/pump-8250/log
   arch: amd64
   os: linux
drainer_servers:
 - host: 192.168.3.107
   ssh_port: 22
   port: 8249
   deploy_dir: /opt/app/tidb-deploy/drainer-8249
   data_dir: /data/tidb-data/drainer-8249
   log_dir: /opt/app/tidb-deploy/drainer-8249/log
   arch: amd64
   os: linux
   config:
     syncer.db-type: "file"

3. Check the cluster status before scaling out

tiup cluster display neibu-tidb

4. Run the scale-out

tiup cluster scale-out neibu-tidb scale-out-binlog.yaml -uroot -p

tiup-cluster-scale-out

5. After the scale-out completes, check the cluster status

The pump and drainer nodes now appear in the output

tiup-cluster-dispaly-after-scale-out

Enabling binlog

Log in to the database and check whether binlog is enabled

MySQL [(none)]> show variables like 'log_bin';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| log_bin       | OFF   |
+---------------+-------+
1 row in set (0.02 sec)

Use tiup cluster edit-config to enable binlog

tiup cluster edit-config neibu-tidb

server_configs:
  tidb:
    binlog.enable: true
    binlog.ignore-error: true

Then reload to apply the configuration

tiup cluster reload neibu-tidb

Finally, check again whether binlog is enabled, and view the status of the pump and drainer nodes

MySQL [(none)]> show variables like 'log_bin';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| log_bin       | ON    |
+---------------+-------+
1 row in set (0.04 sec)
MySQL [(none)]> show pump status;
+--------------------+--------------------+--------+--------------------+---------------------+
| NodeID             | Address            | State  | Max_Commit_Ts      | Update_Time         |
+--------------------+--------------------+--------+--------------------+---------------------+
| 192.168.3.106:8250 | 192.168.3.106:8250 | online | 435347930332266498 | 2022-08-17 15:16:01 |
+--------------------+--------------------+--------+--------------------+---------------------+
1 row in set (0.00 sec)
MySQL [(none)]> show drainer status;
+--------------------+--------------------+--------+--------------------+---------------------+
| NodeID             | Address            | State  | Max_Commit_Ts      | Update_Time         |
+--------------------+--------------------+--------+--------------------+---------------------+
| 192.168.3.107:8249 | 192.168.3.107:8249 | online | 435347623626670082 | 2022-08-17 14:56:29 |
+--------------------+--------------------+--------+--------------------+---------------------+
1 rows in set (0.00 sec)
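
The same status check can be scripted. Below is a minimal sketch, assuming the mysql client is run with --batch so the result is tab-separated with a header row and State in the third column, as in the tables above. The helper name all_nodes_online is hypothetical, and the host and port in the usage comment are placeholders.

```shell
#!/usr/bin/env bash
# Reads tab-separated `show pump status` or `show drainer status` output on stdin
# (header row first, State in the third column).
# Succeeds only if there is at least one node and every node is online.
all_nodes_online() {
  awk -F'\t' '
    BEGIN { bad = 0; n = 0 }
    NR > 1 { n++; if ($3 != "online") bad = 1 }
    END { exit (bad || n == 0) }
  '
}

# Example (placeholder host; 4000 is assumed to be the TiDB port):
#   mysql -h <tidb-host> -P 4000 -u root -p --batch -e 'show pump status' | all_nodes_online \
#     && echo "all pump nodes online"
```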

Verifying that binlog is written to disk

Now look at the pump and drainer logs in turn

1. The pump log contains

[2022/08/17 14:27:45.752 +08:00] [INFO] [server.go:562] ["server info tick"] [writeBinlogCount=182] [alivePullerCount=1] [MaxCommitTS=435347171074375681]

Note that writeBinlogCount=182 here means binlogs are being written; a value of 0 means pump has not received any binlog.

2. The drainer log

[2022/08/17 15:03:02.684 +08:00] [INFO] [syncer.go:260] ["write save point"] [ts=435347726701690882] [version=2737]
[2022/08/17 15:03:06.708 +08:00] [INFO] [syncer.go:260] ["write save point"] [ts=435347727750266881] [version=2737]
[2022/08/17 15:03:09.726 +08:00] [INFO] [syncer.go:260] ["write save point"] [ts=435347728549543937] [version=2737]

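Note version=2737 here; version=0 would mean drainer has not fetched any binlog from pump.

3. Check drainer's data directory: the file binlog-0000000000001511-20220817115052 found there is the binlog file written to disk

Binlog in tidb comes in two formats, text and json; the default is text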
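
The two log checks above can be wrapped in small helpers. This is a sketch under stated assumptions: the function names are hypothetical, and the log file names in the usage comment (pump.log, drainer.log under the log_dir values from scale-out-binlog.yaml) are assumptions about the deployment rather than something shown in the article.

```shell
#!/usr/bin/env bash
# Print the most recent writeBinlogCount value found in a pump log file.
# Prints nothing if the log has no "server info tick" counters yet.
latest_write_binlog_count() {
  grep -o 'writeBinlogCount=[0-9]*' "$1" | tail -n 1 | cut -d= -f2
}

# Print the most recent version value from drainer "write save point" lines.
latest_drainer_version() {
  grep 'write save point' "$1" | grep -o 'version=[0-9]*' | tail -n 1 | cut -d= -f2
}

# Example (assumed log file locations):
#   latest_write_binlog_count /opt/app/tidb-deploy/pump-8250/log/pump.log
#   latest_drainer_version /opt/app/tidb-deploy/drainer-8249/log/drainer.log
```

A result of 0 from either helper corresponds to the "no binlog received" cases described above.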

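Scaling in the binlog components pump/drainer

The scaling examples on the official site do not include a scale-in example for the binlog components pump/drainer

After several experiments, the following steps were found to work for scaling in pump/drainer

1. Edit the cluster configuration and set binlog.enable to false

2. Run scale-in to remove the drainer node first

3. Run scale-in to remove the pump node next

There is one issue here:
after scaling in the pump node, the display output still shows the pump node's status as UP, but a hint appears at the end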

There are some nodes can be pruned:
	Nodes: [192.168.3.106:8250 192.168.3.107:8249]
	You can destroy them with the command: `tiup cluster prune neibu-tidb`

4. So, following the hint, run the command below

tiup cluster prune neibu-tidb

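When Destroy success appears at the end, the scale-in is complete

5. Run display again to check the cluster status; pump and drainer are no longer listed

However, two issues remain:

1) Logging in to tidb at this point, the pump status still shows online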
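
The scale-in order described above can be sketched as a short plan. The script only prints the commands so each one can be reviewed and run by hand. The cluster name and node IDs are the ones used in this article, and the reload after edit-config is an assumption carried over from the tiup section at the top rather than an explicit step in this list.

```shell
#!/usr/bin/env bash
# Sketch only: prints the scale-in commands in order instead of running them.
CLUSTER="neibu-tidb"
DRAINER_NODE="192.168.3.107:8249"
PUMP_NODE="192.168.3.106:8250"

scale_in_plan() {
  # Step 1: set binlog.enable to false (interactive editor), then reload (assumed).
  echo "tiup cluster edit-config ${CLUSTER}"
  echo "tiup cluster reload ${CLUSTER}"
  # Step 2: remove the drainer node first.
  echo "tiup cluster scale-in ${CLUSTER} --node ${DRAINER_NODE}"
  # Step 3: then remove the pump node.
  echo "tiup cluster scale-in ${CLUSTER} --node ${PUMP_NODE}"
  # Step 4: destroy the nodes that display reports as prunable.
  echo "tiup cluster prune ${CLUSTER}"
  # Step 5: confirm that pump and drainer are gone.
  echo "tiup cluster display ${CLUSTER}"
}

scale_in_plan
```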

MySQL [(none)]> show pump status;
+--------------------+--------------------+--------+--------------------+---------------------+
| NodeID             | Address            | State  | Max_Commit_Ts      | Update_Time         |
+--------------------+--------------------+--------+--------------------+---------------------+
| 192.168.3.106:8250 | 192.168.3.106:8250 | online | 435347930332266498 | 2022-08-17 15:16:01 |
+--------------------+--------------------+--------+--------------------+---------------------+
1 row in set (0.00 sec)

2) No pump/drainer nodes exist at this point, but if binlog is enabled, the tidb-server log shows the following error every 30s

[2022/08/17 15:30:00.603 +08:00] [WARN] [client.go:294] ["[pumps client] write binlog to pump failed"] [NodeID=192.168.3.106:8250] ["binlog type"=Prewrite] ["start ts"=435348150832332802] ["commit ts"=0] [length=62088] [error="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 192.168.3.106:8250: connect: connection refused\""]

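Neither issue affects normal use, but they produce error logs and misleading status information.

If you have a solution, you are welcome to share it.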