gh-ost Study Notes: Experiments and Source-Code Walkthrough


![gh-ost logo](https://github.com/WQMysqlDBA/gh-ost/blob/master/doc/images/gh-ost-logo-light-160.png?raw=true)

"开始吧"

gh-ost is an online DDL tool open-sourced by GitHub.

A colleague of mine, not very well known in the industry but rather famous in our circle of friends, once gave a classic description of when to use gh-ost:

"Under a million rows,, you can use online ddl

Over a million,, use gh-ost"

(...just a handful of words, and he still managed to slip in a typo)

"了解一下mysql的ddl的几种方式"

1. MySQL 5.6 InnoDB online DDL:

Pros:
Adding or dropping a column or index, adding a primary key, and renaming a column do not lock the table.
Limitations:
Changing a column's data type, dropping the table's primary key, changing the character set, or adding a FULLTEXT index still locks the table.
Online DDL runs as a single transactional operation (replica lag can become severe), and cancelling it causes a rollback that degrades database performance.
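
Where online DDL does apply, it is worth spelling out the algorithm and lock level so MySQL fails fast instead of silently falling back to a copying, table-locking rebuild. A minimal SQL sketch (table and index names are illustrative):

    -- MySQL raises an error instead of falling back to a table-locking
    -- copy if it cannot satisfy ALGORITHM=INPLACE, LOCK=NONE.
    ALTER TABLE sbtest1
      ADD INDEX idx_k (k),
      ALGORITHM=INPLACE, LOCK=NONE;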

2. pt-online-schema-change:

1) The table must not already have triggers, otherwise the tool errors out.
2) The triggers it installs contend for locks, which can noticeably hurt MySQL performance and, in severe cases, drag the database down (see the sketch below).
3) The rename step can fail in various ways: blocking business traffic on the original table for too long, losing data, or failing the cutover outright.
4) Copying a large table can push hot data out of memory.
5) No pausing: when load on the primary starts to climb, you may want to pause an unfinished schema change, but a trigger-based approach cannot do that; all you can do is cancel, drop the temporary table, and drop the triggers.
6) As data volume and business pressure grow, you hit more and more problems, and some operations can only be risked during off-peak hours.
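
For context on points 1) and 2): pt-osc keeps its shadow table in sync by attaching AFTER INSERT/UPDATE/DELETE triggers to the original table, roughly like the sketch below (names follow pt-osc's conventions but are illustrative, not actual tool output):

    -- Every write to the original table is synchronously replayed onto
    -- the shadow table, inside the same transaction as the original DML.
    CREATE TRIGGER pt_osc_test_sbtest1_ins AFTER INSERT ON sbtest1
      FOR EACH ROW REPLACE INTO _sbtest1_new (id, k, c, pad)
      VALUES (NEW.id, NEW.k, NEW.c, NEW.pad);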

3. gh-ost's characteristics:

1) Lightweight and triggerless: the original table is only read from during the copy, so the impact is small.
2) Testable: you can rehearse the whole migration on a replica first.
3) Pausable: you can throttle when the business gets busy (but mind the binlog retention period while throttled).
4) It poses as a replica and streams the changed binlog events, applying them to the ghost table asynchronously, on the same principle as replication.

Benefits:
gh-ost takes on work that other tools leave to the database itself, with no triggers.
gh-ost gives you finer control over the migration;
you can genuinely pause it;
you can genuinely decouple the migration's write load from the primary's workload.
It also offers a number of operational conveniences that make it safer, more trustworthy, and easier to use.
(Asynchronous, everything is asynchronous: as long as the binlog is complete, you can stop and resume applying it at will.)

How gh-ost works

![gh-ost general flow](https://github.com/WQMysqlDBA/gh-ost/blob/master/doc/images/gh-ost-general-flow.png?raw=true)

![gh-ost operation modes](https://github.com/WQMysqlDBA/gh-ost/blob/master/doc/images/gh-ost-operation-modes.png?raw=true)

(Dark nodes are the Primary; light nodes are Secondaries.)

gh-ost has three operating modes:

1. Connect to a replica, migrate on the primary (diagram b)

2. Connect to the primary, migrate on the primary (diagram a)

3. Test and migrate on a replica (diagram c)

  • 1. Connect to a replica, migrate on the primary (default)

    This is gh-ost's default mode.
    1. gh-ost inspects the replica, finds the primary of the topology and connects to it, then proceeds with the migration.
    2. On the primary it creates _xxx_gho (same structure as the original table) and _xxx_ghc (the changelog), then alters _xxx_gho to the target structure.
    For example:
    _sbtest1_ghc   changelog table, i.e. the journal of binlog application
    _sbtest1_gho   ghost table
    sbtest1        original table
    3. Business traffic proceeds as usual, rows are read and written on the primary; gh-ost reads the replica's binlog and applies the changes to the ghost table on the primary.
    4. Table layout, columns & indexes, row count, etc. are gathered on the replica.
    5. Internal changelog events (such as heartbeats) are read from the replica.
    6. The table switch happens on the primary.
    ./gh-ost --max-load=Threads_running=20 \
    --critical-load=Threads_running=200 \
    --critical-load-interval-millis=5000 \
    --chunk-size=10000  \
    --user="root" \
    --password="xxxx" \
    --host='xxxxxxxxxxxx' --port=3306 \
    --database="test" --table="sbtest1" \
    --verbose --debug --stack  \
    --alter="add key idx_datetime(datetime)" \
    --initially-drop-ghost-table \
    --initially-drop-old-table \
    --execute 2>&1  | tee  rebuild_t1.log
    
  • 2. Connect to the primary, migrate on the primary

    This requires the --allow-on-master option:
    create _xxx_gho and _xxx_ghc on the primary, and alter _xxx_gho to the target structure;
    read rows from the source table on the primary and write them into _xxx_gho;
    read binlog events from the primary and apply the changes to _xxx_gho on the primary;
    perform the table switch on the primary.
    If you have no replica, or simply don't want to use one, you can operate directly on the primary;
    gh-ost will perform every step there. You will need to keep a close eye on replication lag yourself.
    The primary's binlog must be in ROW (RBR) format.
    In this mode you must pass --allow-on-master.
    
    ./gh-ost --max-load=Threads_running=20 \
    --critical-load=Threads_running=200 \
    --critical-load-interval-millis=5000 \
    --chunk-size=10000  \
    --user="root" --password="taotao" \
    --host=127.0.0.1 --port=3306 \
    --database="test" --table="sbtest1" \
    --verbose --debug --stack \
    --allow-on-master \
    --alter="add key idx_createtime(createtime)" \
    --initially-drop-ghost-table --initially-drop-old-table \
    --execute 2>&1  | tee  rebuild_t1.log
    
  • 3. Test and migrate on a replica

    This mode makes its changes on a replica. gh-ost still connects to the primary, but all operations run on the replica, without affecting the primary in any way.
    Throughout the operation, gh-ost also throttles periodically so the replica's data can stay current; it controls the pace to guarantee the replica keeps up.
    --migrate-on-replica tells gh-ost to migrate the table directly on the replica: it builds and cuts over the table there while replication keeps running normally.
    --test-on-replica marks the run as a pure test. Just before the final cut-over, replication is stopped; the original and ghost tables are swapped, then swapped back, so in the end the original table is as if untouched. Both tables are preserved and replication stays stopped, so you can compare and consistency-check the two tables. An invocation sketch follows this block.
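
    A test run against a replica might look like the sketch below, modeled on the commands above (host and credentials are placeholders):

    ./gh-ost --max-load=Threads_running=20 \
    --chunk-size=10000 \
    --user="root" --password="xxxx" \
    --host=<replica_host> --port=3306 \
    --database="test" --table="sbtest1" \
    --alter="add key idx_createtime(createtime)" \
    --test-on-replica \
    --execute 2>&1 | tee test_t1.log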
    
    gh-ost's restrictions
    gh-ost aborts with an error whenever a restriction is violated, so the tool is reasonably safe.
    1. The copy intermittently share-locks small ranges of rows (range size set by the chunk-size parameter), because the row copy uses insert .. select from ... where pk > ? and pk < ? lock in share mode (see the sketch after this list).
    2. Master-master setup is only supported in active-passive setup. Active-active (where the table is being written to on both masters concurrently) is unsupported. It may be supported in the future.
       Presumably this is because gh-ost's implementation relies on explicit locking.
    3. The two before & after tables must share a PRIMARY KEY or other UNIQUE KEY.
    4. Foreign keys are not supported.
    5. Triggers are not supported.
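
    The row-copy statement behind point 1 is shaped roughly like this sketch (columns and range bounds are illustrative; the real statement computes chunk boundaries from the shared unique key):

    INSERT /* gh-ost */ IGNORE INTO _sbtest1_gho (id, k, c, pad)
    SELECT id, k, c, pad FROM sbtest1
    WHERE id >= 1 AND id <= 10000
    LOCK IN SHARE MODE;  -- rows in this chunk stay share-locked until the insert commits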
    

Pausing the migration:

# pause
echo throttle | socat - /tmp/gh-ost.test.t1.sock
# resume
echo no-throttle | socat - /tmp/gh-ost.test.t1.sock

Adjusting throttling parameters on the fly:

echo chunk-size=100 | socat - /tmp/gh-ost.t1.sock

echo max-lag-millis=200 | socat - /tmp/gh-ost.t1.sock

echo max-load=Threads_running=3 | socat - /tmp/gh-ost.t1.sock
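
The same socket also answers status queries, e.g. (status is one of gh-ost's interactive commands):

echo status | socat - /tmp/gh-ost.t1.sock

Below is the full debug log of a sample run (--verbose --debug); the inline ## notes are mine: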
2022-01-07 11:53:33 INFO starting gh-ost 1.1.2
2022-01-07 11:53:33 INFO Migrating `qdump`.`sbtest2`
2022-01-07 11:53:33 INFO inspector connection validated on 10.244.0.43:3306
2022-01-07 11:53:33 INFO User has ALL privileges
2022-01-07 11:53:33 INFO binary logs validated on 10.244.0.43:3306
2022-01-07 11:53:33 INFO Inspector initiated on gh-ost-f4d1c00-0:3306, version 5.7.26-log
2022-01-07 11:53:33 INFO Table found. Engine=InnoDB
2022-01-07 11:53:33 DEBUG Estimated number of rows via STATUS: 1000
2022-01-07 11:53:33 DEBUG Validated no foreign keys exist on table
2022-01-07 11:53:33 DEBUG Validated no triggers exist on table
2022-01-07 11:53:33 INFO Estimated number of rows via EXPLAIN: 1000
2022-01-07 11:53:33 DEBUG Potential unique keys in sbtest2: [PRIMARY (auto_increment): [id]; has nullable: false]
2022-01-07 11:53:33 INFO Recursively searching for replication master
2022-01-07 11:53:33 DEBUG Looking for master on 10.244.0.43:3306
2022-01-07 11:53:33 INFO Master found to be gh-ost-f4d1c00-0:3306
2022-01-07 11:53:33 INFO log_slave_updates validated on 10.244.0.43:3306
2022-01-07 11:53:33 INFO streamer connection validated on 10.244.0.43:3306
2022-01-07 11:53:33 DEBUG Streamer binlog coordinates: mysql-bin.000030:486171970
2022-01-07 11:53:33 INFO Connecting binlog streamer at mysql-bin.000030:486171970
[2022/01/07 11:53:33] [info] binlogsyncer.go:133 create BinlogSyncer with config {99999 mysql 10.244.0.43 3306 gh-ost    false false <nil> false UTC true 0 0s 0s 0 false}
[2022/01/07 11:53:33] [info] binlogsyncer.go:354 begin to sync binlog from position (mysql-bin.000030, 486171970)
[2022/01/07 11:53:33] [info] binlogsyncer.go:203 register slave for master server 10.244.0.43:3306
2022-01-07 11:53:33 DEBUG Beginning streaming
2022-01-07 11:53:33 INFO rotate to next log from mysql-bin.000030:0 to mysql-bin.000030
[2022/01/07 11:53:33] [info] binlogsyncer.go:723 rotate to (mysql-bin.000030, 486171970)
2022-01-07 11:53:33 INFO applier connection validated on 10.244.0.43:3306
2022-01-07 11:53:33 INFO applier connection validated on 10.244.0.43:3306
2022-01-07 11:53:33 INFO will use time_zone='+08:00' on applier
2022-01-07 11:53:33 INFO Examining table structure on applier
2022-01-07 11:53:33 INFO Applier initiated on gh-ost-f4d1c00-0:3306, version 5.7.26-log
2022-01-07 11:53:33 INFO Dropping table `qdump`.`_sbtest2_gho`
2022-01-07 11:53:33 INFO Table dropped
2022-01-07 11:53:33 INFO Dropping table `qdump`.`_sbtest2_del`
2022-01-07 11:53:33 INFO Table dropped
2022-01-07 11:53:33 INFO Dropping table `qdump`.`_sbtest2_ghc`
2022-01-07 11:53:33 INFO Table dropped
2022-01-07 11:53:33 INFO Creating changelog table `qdump`.`_sbtest2_ghc`
2022-01-07 11:53:33 INFO Changelog table created
2022-01-07 11:53:33 INFO Creating ghost table `qdump`.`_sbtest2_gho`
2022-01-07 11:53:33 INFO Ghost table created
2022-01-07 11:53:33 INFO Altering ghost table `qdump`.`_sbtest2_gho`
2022-01-07 11:53:33 DEBUG ALTER statement: alter /* gh-ost */ table `qdump`.`_sbtest2_gho` ADD COLUMN o2 varchar(10)
2022-01-07 11:53:33 INFO Ghost table altered
2022-01-07 11:53:33 INFO Altering ghost table AUTO_INCREMENT value `qdump`.`_sbtest2_gho`
2022-01-07 11:53:33 DEBUG AUTO_INCREMENT ALTER statement: alter /* gh-ost */ table `qdump`.`_sbtest2_gho` AUTO_INCREMENT=1000001
2022-01-07 11:53:34 INFO Ghost table AUTO_INCREMENT altered
2022-01-07 11:53:34 INFO Intercepted changelog state GhostTableMigrated
2022-01-07 11:53:34 INFO Waiting for ghost table to be migrated. Current lag is 0s
2022-01-07 11:53:34 DEBUG ghost table migrated
2022-01-07 11:53:34 INFO Handled changelog state GhostTableMigrated
2022-01-07 11:53:34 DEBUG Potential unique keys in _sbtest2_gho: [PRIMARY (auto_increment): [id]; has nullable: false]
2022-01-07 11:53:34 INFO Chosen shared unique key is PRIMARY
2022-01-07 11:53:34 INFO Shared columns are id,k,c,pad
2022-01-07 11:53:34 INFO Listening on unix socket file: /tmp/gh-ost.qdump.sbtest1.sock
2022-01-07 11:53:34 DEBUG Reading migration range according to key: PRIMARY
2022-01-07 11:53:34 INFO Migration min values: [1]
2022-01-07 11:53:34 DEBUG Reading migration range according to key: PRIMARY
2022-01-07 11:53:34 INFO Migration max values: [1000]
2022-01-07 11:53:34 INFO Waiting for first throttle metrics to be collected
2022-01-07 11:53:34 INFO First throttle metrics collected
2022-01-07 11:53:34 DEBUG Operating until row copy is complete
2022-01-07 11:53:34 DEBUG Getting nothing in the write queue. Sleeping...
# Migrating `qdump`.`sbtest2`; Ghost table is `qdump`.`_sbtest2_gho`
# Migrating gh-ost-f4d1c00-0:3306; inspecting gh-ost-f4d1c00-0:3306; executing on 10-10-30-17
# Migration started at Fri Jan 07 11:53:33 +0800 2022
# chunk-size: 10000; max-lag-millis: 1500ms; dml-batch-size: 10; max-load: Threads_running=25; critical-load: ; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle.qdump.sbtest1
# panic-flag-file: /tmp/gh-ost.panic.qdump.sbtest1.flag
# Serving on unix socket: /tmp/gh-ost.qdump.sbtest1.sock
Copy: 0/1000 0.0%; Applied: 0; Backlog: 0/1000; Time: 0s(total), 0s(copy); streamer: mysql-bin.000030:486175394; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: N/A
Copy: 0/1000 0.0%; Applied: 0; Backlog: 0/1000; Time: 1s(total), 1s(copy); streamer: mysql-bin.000030:486176939; Lag: 0.01s, HeartbeatLag: 0.06s, State: migrating; ETA: N/A
2022-01-07 11:53:35 DEBUG Issued INSERT on range: [1]..[1000]; iteration: 0; chunk-size: 10000
2022-01-07 11:53:35 DEBUG Iteration complete: no further range to iterate
2022-01-07 11:53:35 DEBUG Getting nothing in the write queue. Sleeping...
2022-01-07 11:53:35 INFO Row copy complete
Copy: 1000/1000 100.0%; Applied: 0; Backlog: 0/1000; Time: 1s(total), 1s(copy); streamer: mysql-bin.000030:486211234; Lag: 0.01s, HeartbeatLag: 0.08s, State: migrating; ETA: due
2022-01-07 11:53:35 DEBUG checking for cut-over postpone
2022-01-07 11:53:35 DEBUG checking for cut-over postpone: complete
2022-01-07 11:53:35 INFO Grabbing voluntary lock: gh-ost.77807.lock
2022-01-07 11:53:35 INFO Setting LOCK timeout as 2 seconds
2022-01-07 11:53:35 INFO Looking for magic cut-over table
2022-01-07 11:53:35 INFO Creating magic cut-over table `qdump`.`_sbtest2_del`
2022-01-07 11:53:35 INFO Magic cut-over table created
2022-01-07 11:53:35 INFO Locking `qdump`.`sbtest2`, `qdump`.`_sbtest2_del`
2022-01-07 11:53:35 INFO Tables locked
2022-01-07 11:53:35 INFO Session locking original & magic tables is 77807  ## lock the original table and the magic table
2022-01-07 11:53:35 INFO Writing changelog state: AllEventsUpToLockProcessed:1641527615099148655
2022-01-07 11:53:35 INFO Intercepted changelog state AllEventsUpToLockProcessed
2022-01-07 11:53:35 INFO Handled changelog state AllEventsUpToLockProcessed
2022-01-07 11:53:35 INFO Waiting for events up to lock
Copy: 1000/1000 100.0%; Applied: 0; Backlog: 1/1000; Time: 2s(total), 1s(copy); streamer: mysql-bin.000030:486256890; Lag: 0.01s, HeartbeatLag: 0.06s, State: migrating; ETA: due
2022-01-07 11:53:36 INFO Waiting for events up to lock: got AllEventsUpToLockProcessed:1641527615099148655
2022-01-07 11:53:36 INFO Done waiting for events up to lock; duration=988.095699ms
# Migrating `qdump`.`sbtest2`; Ghost table is `qdump`.`_sbtest2_gho`
# Migrating gh-ost-f4d1c00-0:3306; inspecting gh-ost-f4d1c00-0:3306; executing on 10-10-30-17
# Migration started at Fri Jan 07 11:53:33 +0800 2022
2022-01-07 11:53:36 DEBUG Getting nothing in the write queue. Sleeping...
# chunk-size: 10000; max-lag-millis: 1500ms; dml-batch-size: 10; max-load: Threads_running=25; critical-load: ; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle.qdump.sbtest1
# panic-flag-file: /tmp/gh-ost.panic.qdump.sbtest1.flag
# Serving on unix socket: /tmp/gh-ost.qdump.sbtest1.sock
Copy: 1000/1000 100.0%; Applied: 0; Backlog: 0/1000; Time: 2s(total), 1s(copy); streamer: mysql-bin.000030:486257800; Lag: 0.01s, HeartbeatLag: 0.08s, State: migrating; ETA: due
2022-01-07 11:53:36 INFO Setting RENAME timeout as 1 seconds
2022-01-07 11:53:36 INFO Session renaming tables is 77812
2022-01-07 11:53:36 INFO Issuing and expecting this to block: rename /* gh-ost */ table `qdump`.`sbtest2` to `qdump`.`_sbtest2_del`, `qdump`.`_sbtest2_gho` to `qdump`.`sbtest2`
2022-01-07 11:53:36 INFO Found atomic RENAME to be blocking, as expected. Double checking the lock is still in place (though I don't strictly have to)
2022-01-07 11:53:36 INFO Checking session lock: gh-ost.77807.lock
2022-01-07 11:53:36 INFO Connection holding lock on original table still exists
2022-01-07 11:53:36 INFO Will now proceed to drop magic table and unlock tables
2022-01-07 11:53:36 INFO Dropping magic cut-over table
2022-01-07 11:53:36 INFO Releasing lock from `qdump`.`sbtest2`, `qdump`.`_sbtest2_del`
2022-01-07 11:53:36 INFO Tables unlocked
2022-01-07 11:53:36 INFO Tables renamed
2022-01-07 11:53:36 INFO Lock & rename duration: 1.022541989s. During this time, queries on `sbtest2` were blocked
[2022/01/07 11:53:36] [info] binlogsyncer.go:164 syncer is closing...
[2022/01/07 11:53:36] [error] binlogsyncer.go:631 connection was bad
[2022/01/07 11:53:36] [error] binlogstreamer.go:77 close sync with err: Sync was closed
2022-01-07 11:53:36 DEBUG Done streaming
2022-01-07 11:53:36 INFO Closed streamer connection. err=<nil>
2022-01-07 11:53:36 INFO Dropping table `qdump`.`_sbtest2_ghc`
[2022/01/07 11:53:36] [info] binlogsyncer.go:179 syncer is closed
2022-01-07 11:53:36 INFO Table dropped
2022-01-07 11:53:36 INFO Am not dropping old table because I want this operation to be as live as possible. If you insist I should do it, please add `--ok-to-drop-table` next time. But I prefer you do not. To drop the old table, issue:
2022-01-07 11:53:36 INFO -- drop table `qdump`.`_sbtest2_del`
2022-01-07 11:53:36 INFO Done migrating `qdump`.`sbtest2`
2022-01-07 11:53:36 INFO Removing socket file: /tmp/gh-ost.qdump.sbtest1.sock
2022-01-07 11:53:36 INFO Tearing down inspector
2022-01-07 11:53:36 INFO Tearing down applier
2022-01-07 11:53:36 DEBUG Tearing down...
2022-01-07 11:53:36 INFO Tearing down streamer
2022-01-07 11:53:36 INFO Tearing down throttler
2022-01-07 11:53:36 DEBUG Tearing down...


The general log from a similar run (here the ALTER adds column o1) shows, statement by statement, what gh-ost does on the MySQL side:

2022-01-07T11:55:33.764366+08:00        77850 Query     select @@global.version
2022-01-07T11:55:33.764812+08:00        77850 Query     select @@global.port
2022-01-07T11:55:33.765143+08:00        77847 Query     select @@global.time_zone
2022-01-07T11:55:33.765459+08:00        77847 Query     select @@global.hostname, @@global.port
2022-01-07T11:55:33.765754+08:00        77847 Query     show columns from `qdump`.`sbtest2`
2022-01-07T11:55:33.766630+08:00        77847 Query     drop /* gh-ost */ table if exists `qdump`.`_sbtest2_gho`
2022-01-07T11:55:33.768356+08:00        77847 Query     show /* gh-ost */ table status from `qdump` like '_sbtest2_gho'
2022-01-07T11:55:33.769046+08:00        77847 Query     drop /* gh-ost */ table if exists `qdump`.`_sbtest2_del`
2022-01-07T11:55:33.774425+08:00        77847 Query     show /* gh-ost */ table status from `qdump` like '_sbtest2_del'
2022-01-07T11:55:33.775069+08:00        77847 Query     drop /* gh-ost */ table if exists `qdump`.`_sbtest2_ghc`
2022-01-07T11:55:33.777390+08:00        77847 Query     create /* gh-ost */ table `qdump`.`_sbtest2_ghc` (
                        id bigint auto_increment,
                        last_update timestamp not null DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
                        hint varchar(64) charset ascii not null,
                        value varchar(4096) charset ascii not null,
                        primary key(id),
                        unique key hint_uidx(hint)
                ) auto_increment=256
2022-01-07T11:55:33.785513+08:00        77847 Query     create /* gh-ost */ table `qdump`.`_sbtest2_gho` like `qdump`.`sbtest2`
2022-01-07T11:55:33.793596+08:00        77847 Query     alter /* gh-ost */ table `qdump`.`_sbtest2_gho` ADD COLUMN o1 varchar(10)
2022-01-07T11:55:33.806418+08:00        77847 Query     alter /* gh-ost */ table `qdump`.`_sbtest2_gho` AUTO_INCREMENT=1000001
2022-01-07T11:55:33.810057+08:00        77847 Query     insert /* gh-ost */ into `qdump`.`_sbtest2_ghc`



################### The main cut-over flow ###################
2022-01-07T11:55:34.965562+08:00        77847 Query     START TRANSACTION
2022-01-07T11:55:34.965982+08:00        77847 Query     select connection_id()
2022-01-07T11:55:34.966593+08:00        77847 Query     select get_lock('gh-ost.77847.lock', 0) ## confirm the lock is acquired
2022-01-07T11:55:34.967119+08:00        77847 Query     set session lock_wait_timeout:=2   ## 2x the cut-over rename timeout


2022-01-07T11:55:34.967650+08:00        77852 Query     show /* gh-ost */ table status from `qdump` like '_sbtest2_del'
2022-01-07T11:55:34.968462+08:00        77852 Query     create /* gh-ost */ table `qdump`.`_sbtest2_del` (
                        id int auto_increment primary key
                ) engine=InnoDB comment='ghost-cut-over-sentry'


2022-01-07T11:55:34.975440+08:00        77847 Query     lock /* gh-ost */ tables `qdump`.`sbtest2` write, `qdump`.`_sbtest2_del` write      ------ the lock guarantees the original table produces no further events; from here any write to the table blocks, for at most 2x the rename timeout (by design in the code)

(about 1 second passes)

2022-01-07T11:55:35.935563+08:00        77853 Connect   gh-ost@10.244.0.32 on qdump using TCP/IP
2022-01-07T11:55:35.935869+08:00        77853 Query     SET autocommit=true
2022-01-07T11:55:35.936087+08:00        77853 Query     SET NAMES utf8mb4
2022-01-07T11:55:35.936404+08:00        77853 Query     select id

2022-01-07T11:55:35.933865+08:00        77852 Query     START TRANSACTION
2022-01-07T11:55:35.934154+08:00        77852 Query     select connection_id()
2022-01-07T11:55:35.934490+08:00        77852 Query     set session lock_wait_timeout:=1
2022-01-07T11:55:35.934848+08:00        77852 Query     rename /* gh-ost */ table `qdump`.`sbtest2` to `qdump`.`_sbtest2_del`, `qdump`.`_sbtest2_gho` to `qdump`.`sbtest2`       ------ this is an atomic operation, unaffected by any transaction; at this point it is stuck waiting

2022-01-07T11:55:35.937625+08:00        77853 Query     select is_used_lock('gh-ost.77847.lock')
## 77853 acts as the go-between
## it checks right away: once the rename shows up in the processlist, the lock is released immediately
2022-01-07T11:55:35.938004+08:00        77847 Query     drop /* gh-ost */ table if exists `qdump`.`_sbtest2_del`
2022-01-07T11:55:35.951323+08:00        77847 Query     unlock tables
at this point the rename completes
2022-01-07T11:55:35.951627+08:00        77847 Query     ROLLBACK
2022-01-07T11:55:35.975045+08:00        77852 Query     ROLLBACK
2022-01-07T11:55:36.076006+08:00        77853 Query     drop /* gh-ost */ table if exists `qdump`.`_sbtest2_ghc`
################### End of the main cut-over flow ###################

cut-over
Three sessions cooperate:

Session 1 creates the magic table.
Session 1 locks the original table and the magic table. Once the LOCK TABLES and the creation of the magic table have succeeded, the critical part is done; the rename that follows is carried out by the other sessions.
Session 2 checks whether session 1 holds the lock; if so, it watches the processlist for the rename to show up.
A blocked RENAME always takes priority over blocked INSERT/UPDATE/DELETE, no matter who arrived first.
The moment the rename appears, session 1's lock is released,
letting session 3's rename succeed: original table --> _del, ghost table --> original table.
(These are really two goroutines: goroutine 2 issues AtomicCutoverRename, and that function opens session 3.)
Success --> everyone is happy.
Failure --> the original table is completely unaffected; you can simply try again.

Question 1: after session 1 locks the tables, DML arrives and queues for the lock before the RENAME does; the RENAME requests the lock later, so who gets it first?

The RENAME always acquires the lock ahead of the DML; this comes down to how MySQL grants locks.
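
You can reproduce this priority by hand with three sessions; a sketch (t1 is any throwaway InnoDB table):

-- session 1
LOCK TABLES t1 WRITE;

-- session 2: blocks, waiting on the table lock
INSERT INTO t1 VALUES (1);

-- session 3: also blocks
RENAME TABLE t1 TO t1_old;

-- session 1 again
UNLOCK TABLES;
-- session 3's RENAME runs first, even though session 2's INSERT
-- had been waiting longer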

Code:

Migrate executes the complete migration logic. This is *the* main gh-ost function.

migrator := logic.NewMigrator(migrationContext)
if err := migrator.Migrate(); err != nil {
	// error handling as in cmd/gh-ost/main.go
	migrationContext.Log.Fatale(err)
}

cut-over

Two approaches are provided: atomic | two-step (atomic is the default)

if this.migrationContext.CutOverType == base.CutOverAtomic {
   // Atomic solution: we use low timeout and multiple attempts. But for
   // each failed attempt, we throttle until replication lag is back to normal
   err := this.atomicCutOver()
   this.handleCutOverResult(err)
   return err
}
if this.migrationContext.CutOverType == base.CutOverTwoStep {
   err := this.cutOverTwoStep()
   this.handleCutOverResult(err)
   return err
}
return this.migrationContext.Log.Fatalf("Unknown cut-over type: %d; should never get here!", this.migrationContext.CutOverType)

atomic


// atomicCutOver
func (this *Migrator) atomicCutOver() (err error) {
	atomic.StoreInt64(&this.migrationContext.InCutOverCriticalSectionFlag, 1)
	defer atomic.StoreInt64(&this.migrationContext.InCutOverCriticalSectionFlag, 0)

	okToUnlockTable := make(chan bool, 4)
	var dropCutOverSentryTableOnce sync.Once

	//sync.Once is Go's mechanism for running a function exactly once. It is similar
	//in spirit to an init function, but differs:
	//init runs once when the package is first loaded;
	//sync.Once runs once at whatever point in the running code it is needed.

	defer func() {
		okToUnlockTable <- true
		dropCutOverSentryTableOnce.Do(func() {
			this.applier.DropAtomicCutOverSentryTableIfExists()
		})
	}()

	atomic.StoreInt64(&this.migrationContext.AllEventsUpToLockProcessedInjectedFlag, 0)

	lockOriginalSessionIdChan := make(chan int64, 2)
	tableLocked := make(chan error, 2)
	tableUnlocked := make(chan error, 2)

	// Start a goroutine; call it session 1. It:
	// - sets session lock_wait_timeout = migrationContext.CutOverLockTimeoutSeconds * 2
	// - creates the magic table via CreateAtomicCutOverSentryTable()
	// - issues: lock /* gh-ost */ tables <original> write, <magic table> write (one statement, so both locks are requested together)
	// After that its main work is done: it waits for the okToUnlockTable signal
	// telling it to drop the magic table and release the lock
	// (<-okToUnlockTable: okToUnlockTable acts as a receive-only channel here)
	go func() {
		if err := this.applier.AtomicCutOverMagicLock(lockOriginalSessionIdChan, tableLocked, okToUnlockTable, tableUnlocked, &dropCutOverSentryTableOnce); err != nil {
			this.migrationContext.Log.Errore(err)
		}
	}()

	if err := <-tableLocked; err != nil {
		return this.migrationContext.Log.Errore(err)
	}
	lockOriginalSessionId := <-lockOriginalSessionIdChan
	this.migrationContext.Log.Infof("Session locking original & magic tables is %+v", lockOriginalSessionId)
	// At this point we know the original table is locked.
	// We know any newly incoming DML on original table is blocked.
	if err := this.waitForEventsUpToLock(); err != nil {
		return this.migrationContext.Log.Errore(err)
	}

	// Step 2
	// We now attempt an atomic RENAME on original & ghost tables, and expect it to block.
	this.migrationContext.RenameTablesStartTime = time.Now()

	var tableRenameKnownToHaveFailed int64
	renameSessionIdChan := make(chan int64, 2)
	tablesRenamed := make(chan error, 2)

	// Start another goroutine.
	// If it errors, it simply unlocks the tables; the original table is unaffected.
	// If it succeeds, the two goroutines have communicated: renameSessionIdChan now
	// carries the id of the session issuing the rename, and tablesRenamed gets nil.
	go func() {
		if err := this.applier.AtomicCutoverRename(renameSessionIdChan, tablesRenamed); err != nil {
			// Abort! Release the lock
			atomic.StoreInt64(&tableRenameKnownToHaveFailed, 1)
			okToUnlockTable <- true
		}
	}()
	renameSessionId := <-renameSessionIdChan
	this.migrationContext.Log.Infof("Session renaming tables is %+v", renameSessionId)

	waitForRename := func() error {
		if atomic.LoadInt64(&tableRenameKnownToHaveFailed) == 1 {
			// We return `nil` here so as to avoid the `retry`. The RENAME has failed,
			// it won't show up in PROCESSLIST, no point in waiting
			// If the rename failed, it failed: return nil to skip the retry; the
			// failing goroutine has already signaled okToUnlockTable.
			// A blocked RENAME always beats blocked INSERT/UPDATE/DELETE,
			// no matter who came first.
			return nil
		}
		// normal path: wait for session renameSessionId to show up blocked on a metadata lock, doing a rename
		return this.applier.ExpectProcess(renameSessionId, "metadata lock", "rename")
	}
	// Wait for the RENAME to appear in PROCESSLIST; as soon as it does, the lock
	// can be released immediately
	if err := this.retryOperation(waitForRename, true); err != nil {
		// Abort! Release the lock
		okToUnlockTable <- true
		return err
	}
	// the rename is blocking as expected (it did not fail)
	if atomic.LoadInt64(&tableRenameKnownToHaveFailed) == 0 {
		this.migrationContext.Log.Infof("Found atomic RENAME to be blocking, as expected. Double checking the lock is still in place (though I don't strictly have to)")
	}
	if err := this.applier.ExpectUsedLock(lockOriginalSessionId); err != nil {
		// Abort operation. Just make sure to drop the magic table.
		return this.migrationContext.Log.Errore(err)
	}
	this.migrationContext.Log.Infof("Connection holding lock on original table still exists")

	// Now that we've found the RENAME blocking, AND the locking connection still alive,
	// we know it is safe to proceed to release the lock

	okToUnlockTable <- true
	// BAM! magic table dropped, original table lock is released
	// -> RENAME released -> queries on original are unblocked.
	if err := <-tableUnlocked; err != nil {
		return this.migrationContext.Log.Errore(err)
	}
	if err := <-tablesRenamed; err != nil {
		return this.migrationContext.Log.Errore(err)
	}
	this.migrationContext.RenameTablesEndTime = time.Now()

	// ooh nice! We're actually truly and thankfully done
	lockAndRenameDuration := this.migrationContext.RenameTablesEndTime.Sub(this.migrationContext.LockTablesStartTime)
	this.migrationContext.Log.Infof("Lock & rename duration: %s. During this time, queries on %s were blocked", lockAndRenameDuration, sql.EscapeName(this.migrationContext.OriginalTableName))
	return nil
}

cutOverTwoStep

// cutOverTwoStep will lock down the original table, execute
// what's left of last DML entries, and **non-atomically** swap original->old, then new->original.
// There is a point in time where the "original" table does not exist and queries are non-blocked
// and failing.
func (this *Migrator) cutOverTwoStep() (err error) {
	atomic.StoreInt64(&this.migrationContext.InCutOverCriticalSectionFlag, 1)
	defer atomic.StoreInt64(&this.migrationContext.InCutOverCriticalSectionFlag, 0)
	atomic.StoreInt64(&this.migrationContext.AllEventsUpToLockProcessedInjectedFlag, 0)

	if err := this.retryOperation(this.applier.LockOriginalTable); err != nil {
		return err
	}

	if err := this.retryOperation(this.waitForEventsUpToLock); err != nil {
		return err
	}
	if err := this.retryOperation(this.applier.SwapTablesQuickAndBumpy); err != nil {
		return err
	}
	if err := this.retryOperation(this.applier.UnlockTables); err != nil {
		return err
	}

	lockAndRenameDuration := this.migrationContext.RenameTablesEndTime.Sub(this.migrationContext.LockTablesStartTime)
	renameDuration := this.migrationContext.RenameTablesEndTime.Sub(this.migrationContext.RenameTablesStartTime)
	this.migrationContext.Log.Debugf("Lock & rename duration: %s (rename only: %s). During this time, queries on %s were locked or failing", lockAndRenameDuration, renameDuration, sql.EscapeName(this.migrationContext.OriginalTableName))
	return nil
}

References

https://www.cnblogs.com/zhoujinyi/p/9187421.html
https://github.com/github/gh-ost/issues/82
https://dev.mysql.com/doc/refman/5.7/en/lock-tables.html
