Distributed Data Migration Tool

Latest recommended article published on 2024-05-12 09:35:53

waj89757

Categories: java, distributed. Tags: distributed migration, mysql, mycat

Article link: https://blog.csdn.net/tracymm19891990/article/details/53983657

GitHub: https://github.com/waj89757/db-migration

Data Migration Project

This project is a distributed migration service for MySQL. It includes full data migration, incremental data migration, and real-time data checking. The service handles large-scale data migration efficiently and accurately. Our company uses it for expanding and shrinking distributed databases, disaster recovery, and business service upgrades.

Flow Chart

[Figure: flow chart]

System Architecture

[Figure: system architecture]

Full data migration

1. Two ways to extract

Way 1: Extract directly from the master, limiting the I/O and load on the master through configurable thresholds.
Way 2: Clone a slave from the master and extract from the slave.
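
Way 1 depends on throttling reads against the master. The following is a minimal sketch of one way such a threshold could work, not the project's actual mechanism. The class `RateLimiter` and the threshold name `max_rows_per_sec` are hypothetical; the real configuration keys are not shown in this article.

```python
import time


class RateLimiter:
    """Caps extraction at max_rows_per_sec (a hypothetical threshold name)."""

    def __init__(self, max_rows_per_sec, clock=time.monotonic, sleep=time.sleep):
        self.max_rows_per_sec = max_rows_per_sec
        self.clock = clock
        self.sleep = sleep
        self.start = clock()
        self.rows = 0

    def acquire(self, n):
        """Record n extracted rows and sleep if extraction is ahead of the limit.

        Returns the number of seconds slept (0.0 if no throttling was needed).
        """
        self.rows += n
        # Earliest moment at which self.rows rows may have been read in total.
        earliest = self.start + self.rows / self.max_rows_per_sec
        delay = earliest - self.clock()
        if delay > 0:
            self.sleep(delay)
        return max(delay, 0.0)
```

An extractor would call `acquire(len(batch))` after each batch it reads from the master, so the average read rate stays at or below the configured threshold.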

2. Index

We require that any table with more than 10 thousand rows or larger than 100 MB must have an index. Generally, we extract data by primary key.
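
The index rule can be expressed as a small pre-flight check. This is only an illustrative sketch: the function names are hypothetical, and it assumes 1 MB means 1024 * 1024 bytes.

```python
ROW_LIMIT = 10_000                     # "10 thousand rows" from the rule above
SIZE_LIMIT_BYTES = 100 * 1024 * 1024   # "100 MB", assuming 1 MB = 1024 * 1024 bytes


def requires_index(row_count, size_bytes):
    """True if the table is large enough that the rule demands an index."""
    return row_count > ROW_LIMIT or size_bytes > SIZE_LIMIT_BYTES


def validate_table(name, row_count, size_bytes, index_columns):
    """Raise ValueError if a large table has no index to extract by."""
    if requires_index(row_count, size_bytes) and not index_columns:
        raise ValueError(f"table {name!r} exceeds the size rule but has no index")
    return True
```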

3. Performance

We divide a schema into many table tasks, and every node processes some of them. A master node controls the progress of all slave nodes.
For a single table, we divide the data into chunks by index and run multiple threads to process the chunks.
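
The per-table chunking can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's code: plain dicts keyed by an integer primary key stand in for the MySQL source and target tables, and the names `split_into_chunks` and `migrate_table` are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(min_id, max_id, chunk_size):
    """Return inclusive (low, high) primary-key ranges covering [min_id, max_id]."""
    chunks = []
    low = min_id
    while low <= max_id:
        high = min(low + chunk_size - 1, max_id)
        chunks.append((low, high))
        low = high + 1
    return chunks


def migrate_table(source, target, chunk_size=1000, workers=4):
    """Copy rows from source to target chunk by chunk using a thread pool.

    source and target are dicts keyed by primary key (stand-ins for tables).
    Returns the number of rows copied.
    """
    if not source:
        return 0
    lock = threading.Lock()

    def copy_chunk(bounds):
        low, high = bounds
        # Similar in spirit to: SELECT ... WHERE id BETWEEN low AND high
        rows = {k: v for k, v in source.items() if low <= k <= high}
        with lock:
            target.update(rows)
        return len(rows)

    chunks = split_into_chunks(min(source), max(source), chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(copy_chunk, chunks))
```

Chunking by key range keeps each query bounded and lets chunks be retried independently, which is one reason an index on large tables matters.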

Incremental data migration

1. Log the binlog position

The incremental service starts from the binlog position logged by the full data task.

2. Catch incremental data

We enhanced canal, which is a MySQL binlog server. The incremental service can subscribe to a binlog position and fetch data from canal over RPC.

3. Performance

We use one thread per db task in order to preserve the data sequence. We pull and process data in batches to increase performance.
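
A minimal sketch of ordered, batched processing on a single thread. It does not use canal's real client API: each event is a simplified `(op, key, value)` tuple, and the names `batches` and `apply_events` are hypothetical.

```python
from itertools import islice


def batches(events, batch_size):
    """Yield consecutive lists of up to batch_size events, preserving order."""
    it = iter(events)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def apply_events(target, events, batch_size=100):
    """Apply binlog-like events to target strictly in their original order.

    target is a dict keyed by primary key (a stand-in for the target table).
    Returns the number of events applied.
    """
    applied = 0
    for batch in batches(events, batch_size):
        for op, key, value in batch:
            if op == "delete":
                target.pop(key, None)
            else:  # "insert" or "update"
                target[key] = value
        applied += len(batch)
    return applied
```

Because a single thread applies the events for one db, a later update to a row can never be overtaken by an earlier one; batching only reduces the number of round trips.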

Full data check

1. When to start?

The incremental data service notifies the check service to start once it has caught up and is keeping the migration target in sync with the master.

2. How to check full data?

First, log the binlog position.
Second, extract and chunk the data by index.
Finally, compare and repair the data.

3. How to compare data?

Sign each row's data with MD5 and compare the signatures between target and master.
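
The signature comparison and the repair step can be sketched like this. It is an illustration under assumptions, not the project's implementation: rows are dicts of column name to value, chunks are dicts keyed by primary key, and the serialization format used before hashing is a choice made for this example.

```python
import hashlib


def row_signature(row):
    """MD5 hex digest over the row's columns, taken in sorted column order.

    repr() is used so that None and the string 'None' hash differently;
    this also means values must have the same type on both sides.
    """
    parts = [f"{column}={row[column]!r}" for column in sorted(row)]
    return hashlib.md5("\x1f".join(parts).encode("utf-8")).hexdigest()


def diff_chunk(master_rows, target_rows):
    """Return sorted primary keys that differ or exist on one side only."""
    mismatched = []
    for key in sorted(set(master_rows) | set(target_rows)):
        m = master_rows.get(key)
        t = target_rows.get(key)
        if m is None or t is None or row_signature(m) != row_signature(t):
            mismatched.append(key)
    return mismatched


def repair_chunk(master_rows, target_rows):
    """Make target match master for every mismatched key; return those keys."""
    keys = diff_chunk(master_rows, target_rows)
    for key in keys:
        if key in master_rows:
            target_rows[key] = dict(master_rows[key])
        else:
            del target_rows[key]
    return keys
```

Comparing fixed-length digests instead of full rows keeps the amount of data exchanged per chunk small, at the cost of hashing each row on both sides.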

Incremental data check

1. When to start?

It starts from the binlog position once the full data check has finished.

2. How to compare data?

Subscribe to both target and master, and compare the binlog data within a certain range.
The incremental data service logs the target and master binlog positions and notifies the incremental data check to consume them.
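
One possible reading of the range comparison, sketched under explicit assumptions: each side's binlog is modeled as an ordered list of `(position, key, row)` tuples, where `row` is the row image after the change or `None` for a delete, and the two sides are compared by the last image of each key inside the paired ranges. The article does not describe the exact comparison, so the names and the strategy here are hypothetical.

```python
_MISSING = object()  # marks a key that has no event in the range


def last_images(events, start, end):
    """Map each key to its last row image among events with start <= position <= end."""
    images = {}
    for position, key, row in events:  # events are assumed ordered by position
        if start <= position <= end:
            images[key] = row          # None means the row was deleted
    return images


def compare_ranges(master_events, master_range, target_events, target_range):
    """Return sorted keys whose final state differs between the two ranges."""
    m = last_images(master_events, *master_range)
    t = last_images(target_events, *target_range)
    return sorted(
        key for key in set(m) | set(t)
        if m.get(key, _MISSING) != t.get(key, _MISSING)
    )
```

Master and target have independent binlog positions, which is why each comparison takes a pair of ranges rather than a single one.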

Distribution and recovery

The migration services use a master-slave architecture. The master node allocates schedules and transfers failed tasks.
Each slave node reports its progress to the master, and the master saves it. If a slave fails, the master chooses another node to continue the task from the point of interruption.
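
The failover behaviour described above can be sketched as a small in-memory model. This is an assumption-laden illustration: the class `Master`, its methods, and the least-loaded assignment policy are hypothetical, and a real service would persist progress and detect failures through heartbeats or similar.

```python
class Master:
    """Tracks task assignment and saved progress for a set of slave nodes."""

    def __init__(self, nodes):
        self.alive = list(nodes)
        self.assignment = {}  # task -> node currently responsible
        self.progress = {}    # task -> last checkpoint reported by a slave

    def assign(self, task):
        """Give the task to the least-loaded alive node.

        Returns (node, checkpoint), where checkpoint is the point to resume from.
        """
        if not self.alive:
            raise RuntimeError("no alive nodes to run the task")
        load = {node: 0 for node in self.alive}
        for owner in self.assignment.values():
            if owner in load:
                load[owner] += 1
        node = min(self.alive, key=lambda n: load[n])
        self.assignment[task] = node
        return node, self.progress.get(task, 0)

    def report(self, task, checkpoint):
        """Save the progress a slave reports for a task."""
        self.progress[task] = checkpoint

    def node_failed(self, node):
        """Reassign every task of the failed node; return {task: (node, checkpoint)}."""
        self.alive.remove(node)
        moved = {}
        for task, owner in list(self.assignment.items()):
            if owner == node:
                moved[task] = self.assign(task)
        return moved
```

Because the master keeps the last reported checkpoint, the replacement node resumes from the point of interruption instead of restarting the task.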