The evolution of MySQL semi-sync

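MySQL 5.5 introduced semi-sync replication. After the master commits a transaction, the dump thread sends the corresponding binlog to the slaves, and the master returns to the user thread only once at least one slave has sent back an ACK.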

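Caveats

1 A slave ACK only means the io_thread has written the event to the relay log; it does not mean the sql_thread has applied it.

2 The master sends the transaction to the slave only after committing it, so if the master crashes at that moment, master and slave data can diverge.

3 The dump thread is responsible both for sending the binlog and for receiving slave ACKs, and the two cannot run in parallel, which is inefficient.

4 The dump thread acquires LOCK_log while reading the binlog; while the mutex is held, no other thread can read or write the binlog.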

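Later versions improved on these points step by step.

1 after_sync

5.7 introduced the rpl_semi_sync_master_wait_point parameter, which lets the DBA choose the stage at which the master waits for the slave ACK: either as before (AFTER_COMMIT), or after the master has flushed the transaction to the binlog but before it commits to the storage engine (AFTER_SYNC).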

AFTER_SYNC (the default): The master writes each transaction to its binary log and the slave, and syncs the binary log to disk. The master waits for slave acknowledgment of transaction receipt after the sync. Upon receiving acknowledgment, the master commits the transaction to the storage engine and returns a result to the client, which then can proceed.

AFTER_COMMIT: The master writes each transaction to its binary log and the slave, syncs the binary log, and commits the transaction to the storage engine. The master waits for slave acknowledgment of transaction receipt after the commit. Upon receiving acknowledgment, the master returns a result to the client, which then can proceed.

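Suppose the master has two client connections, clienta and clientb.

clienta commits a transaction. Before 5.7, MySQL writes it to redo (prepare), the binlog, and redo (commit) in that order, then performs semi-sync, and returns to clienta only after receiving the slave ACK.

clientb can therefore see clienta's data as soon as redo (commit) completes, one step ahead of clienta itself, so the two connections observe inconsistent data.

after_sync avoids this problem: when clienta commits a transaction, MySQL writes it to redo and the binlog, then performs semi-sync, and runs redo (commit) only after receiving the slave ACK, after which it returns to clienta.

after_commit has another problem: if the master crashes between redo (commit) and semi-sync, master and slave data are inconsistent.

after_sync at least guarantees that every transaction whose redo (commit) succeeded has already reached a slave, which is half a step better.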
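The difference between the two wait points comes down to the order of two steps. The sketch below models only that ordering; the step names (redo_prepare, binlog_flush_and_sync, wait_slave_ack, engine_commit, reply_to_client) are hypothetical labels chosen for illustration, not MySQL identifiers.

```python
from enum import Enum


class WaitPoint(Enum):
    AFTER_COMMIT = "AFTER_COMMIT"
    AFTER_SYNC = "AFTER_SYNC"


def commit_steps(wait_point):
    """Return the ordered steps the master takes for one transaction (toy model)."""
    steps = ["redo_prepare", "binlog_flush_and_sync"]
    if wait_point is WaitPoint.AFTER_SYNC:
        # Wait for the slave first, then make the change visible.
        steps += ["wait_slave_ack", "engine_commit"]
    else:
        # Make the change visible first, then wait for the slave.
        steps += ["engine_commit", "wait_slave_ack"]
    steps.append("reply_to_client")
    return steps


def visible_before_ack(wait_point):
    """True if other sessions can read the data before any slave has acknowledged it."""
    steps = commit_steps(wait_point)
    return steps.index("engine_commit") < steps.index("wait_slave_ack")


for wp in WaitPoint:
    print(wp.value, commit_steps(wp), "visible before ACK:", visible_before_ack(wp))
```

In this model clientb can read clienta's rows before the ACK only under AFTER_COMMIT, which is the window described above.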

2 ack collector thread

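5.7 introduced this dedicated thread. The dump thread now only reads and sends binlog events, while receiving slave ACKs is handled by the ACK collector thread.

The dump thread can keep sending events without waiting for an ACK, similar to TCP's sliding window protocol.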
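A rough back-of-the-envelope model of why this matters, assuming the old dump thread blocks on the ACK of each transaction before sending the next one. The function names and the numbers are hypothetical and only illustrate the shape of the cost, not measured MySQL behavior.

```python
def serial_time_us(n_trx, send_us, rtt_us):
    """Old model: send one transaction, then block until its ACK comes back."""
    return n_trx * (send_us + rtt_us)


def pipelined_time_us(n_trx, send_us, rtt_us):
    """New model: keep sending; only the last ACK's round trip is exposed."""
    return n_trx * send_us + rtt_us


# Hypothetical workload: 1000 transactions, 50 us to send each, 1000 us round trip.
n_trx, send_us, rtt_us = 1000, 50, 1000
print("serial   :", serial_time_us(n_trx, send_us, rtt_us), "us")
print("pipelined:", pipelined_time_us(n_trx, send_us, rtt_us), "us")
```

Under these assumptions the round trip dominates the serial case, which is why moving ACK handling out of the dump thread helps most on links with noticeable latency.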

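The master maintains a list of semisync slaves; the list persists even if the ack thread goes down.

When a dump thread calls transmit_start, it registers the slave with the master; if the slave supports semisync, it is added to the semisync slave list.

The ack thread listens on the semisync slave list with select().

The Ack_receiver class manages the ACK thread.

The thread has three states: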

enum status { ST_UP, ST_STOPPING, ST_DOWN };

ST_UP    means ack receive thread is created and is working.

ST_DOWN  means ack receive thread is destroyed.

ST_STOPPING means a user is disabling semisync master, and ack receive thread is being destroyed.

- m_slaves

A slave vector which includes slaves' useful information here.

DEFINITION:

Slave_vector m_slaves

- m_mutex

m_slaves and m_status are shared between user sessions(dump threads) and ack thread. So they should be protected by a mutex.

- add_slave()

Add a new semisync slave to slave list.

DEFINITION:

bool add_slave(THD *thd);

LOGIC:

initialize slave information.

acquire m_mutex

add the slave's information into m_slaves.

send a signal to ack receive thread. It may be waiting for a signal.

release m_mutex

- remove_slave()

remove a semisync slave from slave list.

DEFINITION:

void remove_slave(THD *thd);

LOGIC:

acquire m_mutex

remove thd of the slave from m_slaves.

release m_mutex

- run()

The handle function of receive thread.

DEFINITION:

void run();

LOGIC:

initialize pthread related things

while (1)
{
    acquire m_mutex
    if m_status is ST_STOPPING then break the loop.
    wait for a semisync slave to be added if the slave list is empty.
    call select to listen on sockets, timeout is 1s.
    restart and continue the loop if an error or timeout happens.
    receive and report acks to semisync master.
    release m_mutex
}

de-initialize pthread related things

Note: Giving select a timeout lets other threads add/remove slaves or stop the ack receive thread when there is no ack.
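The add_slave() / remove_slave() / run() logic above can be sketched as a small toy model. This is not MySQL's implementation: the class AckReceiver, the report_ack callback, the select_timeout parameter, and the use of socket.socketpair() in place of real replication connections are all assumptions made for illustration. The sketch also takes a snapshot of the slave list under the lock and calls select() after releasing it, so that add/remove calls are never blocked for the length of the timeout.

```python
import select
import socket
import threading


class AckReceiver:
    """Toy model of the Ack_receiver logic described above."""

    ST_UP, ST_STOPPING, ST_DOWN = "ST_UP", "ST_STOPPING", "ST_DOWN"

    def __init__(self, report_ack, select_timeout=1.0):
        self._cond = threading.Condition()  # plays the role of m_mutex plus its signal
        self._slaves = {}                   # socket -> slave name (stands in for m_slaves)
        self._status = self.ST_DOWN
        self._report_ack = report_ack       # hypothetical callback into the semisync master
        self._select_timeout = select_timeout
        self._thread = None

    def start(self):
        with self._cond:
            self._status = self.ST_UP
        self._thread = threading.Thread(target=self.run, daemon=True)
        self._thread.start()

    def stop(self):
        with self._cond:
            self._status = self.ST_STOPPING
            self._cond.notify_all()
        self._thread.join()

    def add_slave(self, name, sock):
        with self._cond:
            self._slaves[sock] = name
            self._cond.notify_all()         # the receiver may be waiting for a slave

    def remove_slave(self, sock):
        with self._cond:
            self._slaves.pop(sock, None)

    def run(self):
        while True:
            with self._cond:
                if self._status == self.ST_STOPPING:
                    break
                if not self._slaves:
                    # Nothing to listen on yet: wait for add_slave() or stop().
                    self._cond.wait(timeout=self._select_timeout)
                    continue
                socks = list(self._slaves)  # snapshot taken under the lock
            readable, _, _ = select.select(socks, [], [], self._select_timeout)
            for sock in readable:
                data = sock.recv(64)
                if not data:                # peer closed the connection
                    self.remove_slave(sock)
                    continue
                with self._cond:
                    name = self._slaves.get(sock)
                if name is not None:
                    self._report_ack(name, data.decode())
        with self._cond:
            self._status = self.ST_DOWN


acks = []
got_ack = threading.Event()


def report(name, payload):
    acks.append((name, payload))
    got_ack.set()


receiver = AckReceiver(report, select_timeout=0.1)
receiver.start()
master_side, slave_side = socket.socketpair()
receiver.add_slave("slave-1", master_side)
slave_side.sendall(b"binlog.000001:154")    # a made-up ACK payload
got_ack.wait(timeout=5)
receiver.stop()
master_side.close()
slave_side.close()
print(acks)
```

Because select() returns after the timeout even when no ACK arrives, stop() is noticed within one timeout period, which mirrors the note above.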

3 Removing the dump thread's LOCK_log mutex

Before this change, the writer and the dump thread work as follows:

Foreground thread writing the binlog

    acquire LOCK_log
    write log event to binlog
    release LOCK_log
    signal update

Dump thread

    while client is not killed:
        acquire LOCK_log
        read event from binlog
        release LOCK_log
        if EOF was reached in the previous read:
            acquire LOCK_log
            wait for update signal
            read event from binlog
            release LOCK_log

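Whenever a dump thread reads the binlog it takes the LOCK_log mutex, which blocks every other read or write request on that binlog for the duration.

Removing LOCK_log

Events are only ever appended to the tail of the current binlog, so reading events elsewhere in the file needs no lock.

The only concern is that a dump thread might read an incomplete event while a foreground thread is still writing it.

To handle this, MYSQL_BIN_LOG introduces a variable binlog_end_pos that records the end position of the last event in the current binlog; dump threads only read events before it.

write thread: updates this variable after appending an event.

read thread: reads only events before binlog_end_pos.

The variable is protected by LOCK_binlog_end_pos, which must be held for both reads and writes.

The dump thread logic then becomes:

dump thread design: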

    end_position = 0
    while client is not killed:
        if current read position == end_position:
            acquire lock_binlog_end
            while end_position == binlog_end and client is not killed:
                wait for update signal
            end_position = binlog_end
            release lock_binlog_end
        if client is killed:
            break
        read event from binlog
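The design above can be sketched as a small runnable model. It is a toy under stated assumptions, not MySQL code: the Binlog class, wait_for_end_pos(), and the send callback are hypothetical names, a Python list stands in for the binlog file, and a threading.Condition stands in for LOCK_binlog_end_pos together with the update signal.

```python
import threading


class Binlog:
    """Append-only log; readers only touch events below binlog_end_pos."""

    def __init__(self):
        self._events = []                       # stands in for the binlog file
        self._lock_log = threading.Lock()       # serializes writers (LOCK_log)
        self._end_cond = threading.Condition()  # guards binlog_end_pos
        self._binlog_end_pos = 0

    def append(self, event):
        with self._lock_log:
            self._events.append(event)          # the event is complete at this point
            with self._end_cond:
                self._binlog_end_pos = len(self._events)
                self._end_cond.notify_all()     # the "update signal"

    def wait_for_end_pos(self, known_end, is_killed):
        """Block until binlog_end_pos moves past known_end or the client is killed."""
        with self._end_cond:
            while self._binlog_end_pos == known_end and not is_killed():
                self._end_cond.wait(timeout=0.1)
            return self._binlog_end_pos

    def read(self, pos):
        # No LOCK_log needed: callers only pass pos < binlog_end_pos.
        return self._events[pos]


def dump_thread(binlog, killed, send):
    read_pos = 0
    end_position = 0
    while not killed.is_set():
        if read_pos == end_position:
            end_position = binlog.wait_for_end_pos(end_position, killed.is_set)
        if killed.is_set():
            break
        send(binlog.read(read_pos))
        read_pos += 1


binlog = Binlog()
killed = threading.Event()
sent = []
all_sent = threading.Event()


def send(event):
    sent.append(event)
    if len(sent) == 3:
        all_sent.set()


t = threading.Thread(target=dump_thread, args=(binlog, killed, send))
t.start()
for e in ("ev1", "ev2", "ev3"):
    binlog.append(e)
all_sent.wait(timeout=5)
killed.set()
t.join(timeout=5)
print(sent)
```

The reader never takes the writers' lock; it only synchronizes on the end position, so a slow dump thread no longer stalls foreground commits in this model.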

http://dev.mysql.com/worklog/task/?id=5721#tabs-5721-5
