Maxwell Technical Practice V1.0


1 MAXWELL OVERVIEW

Maxwell is an application that reads the MySQL binary log (binlog) in real time, turns each change into a JSON-formatted message, and, acting as a producer, sends it to Kafka, Kinesis, RabbitMQ, Redis, Google Cloud Pub/Sub, files, or other platforms. Common use cases include ETL, cache maintenance, collecting table-level DML metrics, incremental feeds into search engines, data partition migration, and binlog-based rollback schemes for database splits.
Official site: http://maxwells-daemon.io/quickstart/
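
For orientation, each row change becomes one self-contained JSON document; the line below is taken from the test run in section 3.1:

{"database":"test","table":"test_maxwell","type":"insert","ts":1595402774,"xid":1207,"commit":true,"data":{"age":4,"tall":6,"name":"jack","record":4}}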

1.1 HOW MAXWELL WORKS

Maxwell works by posing as a MySQL slave and riding on MySQL's master-slave replication mechanism. This is the same idea as Canal and is not repeated here; see section 1.1 of 《Canal技术实践V1.0.docx》.

1.2 MAXWELL FEATURES

Maxwell's main features:
1. Monitors changes to MySQL data in real time
2. Outputs each change directly as JSON
3. Supports filtering by database, table, and column, as well as blacklists

1.3 MAXWELL USE CASES

  1. Monitor and synchronize changes to MySQL data in real time.

2 MAXWELL DEPLOYMENT

2.1 Middleware Version Selection

Middleware    Version
CentOS        CentOS 6.8
MySQL         5.6.24
Kafka         0.11.0.2
Maxwell       1.27.0

2.2 Environment Preparation

For this exercise Maxwell is installed standalone on the hadoop102 host; an HA deployment will be planned separately later.

2.2.1 CentOS 6.8

The CentOS 6.8 installation itself is omitted. Create the user/group zhouchen in advance.
Install JDK 1.8.0_92 or later in advance.
Install MySQL in advance.
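
A quick sanity check of the prerequisites (standard commands, not part of the original steps):

[zhouchen@hadoop102 ~]$ java -version
[zhouchen@hadoop102 ~]$ mysql --version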

2.2.2 Disable the Firewall (as root)

For CentOS versions below 7:
1. Check the firewall status
service iptables status
2. Stop the firewall
service iptables stop
3. Start the firewall
service iptables start
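
Optionally, to keep the firewall from coming back after a reboot on CentOS 6 (not required by the steps above):

[zhouchen@hadoop102 ~]$ sudo chkconfig iptables off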

2.3 Cluster Installation

2.3.1 MySQL Configuration

1. Grant privileges

mysql>  GRANT ALL on maxwell.* to 'maxwell'@'%' identified by 'maxwell';
Query OK, 0 rows affected (0.01 sec)

mysql> GRANT SELECT, REPLICATION CLIENT, REPLICATION SLAVE on *.* to 'maxwell'@'%';
Query OK, 0 rows affected (0.00 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
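
To confirm the grants took effect, the privileges can be listed (standard MySQL statement, added here as a check):

mysql> SHOW GRANTS FOR 'maxwell'@'%';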

2. Enable binlog
Open the my.cnf file; create it if it does not exist.

[zhouchen@hadoop102 /]$ sudo find ./ -iname my.cnf
./usr/my.cnf

Add the following configuration:

[zhouchen@hadoop102 /]$ sudo vim /usr/my.cnf
[mysqld]
server-id= 1
log-bin= mysql-bin #enable binlog
binlog_format= row #use row format

3. Restart MySQL so the binlog settings take effect

[zhouchen@hadoop102 /]$ sudo service mysql restart

4. Check whether binlog is in effect
1) Enter MySQL

mysql> show variables like 'log_%';
+----------------------------------------+--------------------------------+
| Variable_name                          | Value                          |
+----------------------------------------+--------------------------------+
| log_bin                                | ON                             |
| log_bin_basename                       | /var/lib/mysql/mysql-bin       |
| log_bin_index                          | /var/lib/mysql/mysql-bin.index |
| log_bin_trust_function_creators        | OFF                            |
| log_bin_use_v1_row_events              | OFF                            |
| log_error                              | /var/lib/mysql/hadoop201.err   |
| log_output                             | FILE                           |
| log_queries_not_using_indexes          | OFF                            |
| log_slave_updates                      | OFF                            |
| log_slow_admin_statements              | OFF                            |
| log_slow_slave_statements              | OFF                            |
| log_throttle_queries_not_using_indexes | 0                              |
| log_warnings                           | 1                              |
+----------------------------------------+--------------------------------+

*log_bin: ON means binlog was enabled successfully
2) Check that the mysql-bin files exist under /var/lib/mysql
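
Besides log_bin, it is worth confirming that row-based logging is active, since Maxwell requires row format (an extra check not shown in the original output):

mysql> show variables like 'binlog_format';

The expected value is ROW.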

2.3.2 Maxwell Installation

1. Download Maxwell

[zhouchen@hadoop102 software]$ wget https://github.com/zendesk/maxwell/releases/download/v1.27.0/maxwell-1.27.0.tar.gz

2. Extract

[zhouchen@hadoop102 software]$ tar -zxvf maxwell-1.27.0.tar.gz -C /opt/module/

3. Edit Maxwell's common configuration file config.properties

[zhouchen@hadoop102 maxwell-1.27.0]$ vim config.properties
# tl;dr config
log_level=info

# mysql login info
host=hadoop102
user=maxwell
password=maxwell
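
Maxwell looks for config.properties in its working directory by default; it can also be pointed at the file explicitly. A minimal sketch, assuming the --config option of this version behaves as documented:

[zhouchen@hadoop102 maxwell-1.27.0]$ bin/maxwell --config /opt/module/maxwell-1.27.0/config.properties --producer=stdout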

2.3.3 Startup

Start Maxwell:

[zhouchen@hadoop102 maxwell-1.27.0]$ bin/maxwell --user='maxwell' --password='maxwell' --host='hadoop102' --producer=stdout
Using kafka version: 1.0.0
15:20:33,541 WARN  MaxwellMetrics - Metrics will not be exposed: metricsReportingType not configured.
15:20:33,831 INFO  SchemaStoreSchema - Creating maxwell database
15:20:33,912 INFO  Maxwell - Maxwell v1.27.0 is booting (StdoutProducer), starting at Position[BinlogPosition[mysql-bin.000070:4928], lastHeartbeat=0]
15:20:33,998 INFO  AbstractSchemaStore - Maxwell is capturing initial schema
15:20:34,209 INFO  BinlogConnectorReplicator - Setting initial binlog pos to: mysql-bin.000070:4928
15:20:34,220 INFO  BinaryLogClient - Connected to hadoop102:3306 at mysql-bin.000070/4928 (sid:6379, cid:25)
15:20:34,220 INFO  BinlogConnectorReplicator - Binlog connected. 

2.3.4 Installation Check

1. Check the process

[zhouchen@hadoop102 ~]$ jps
6466 Maxwell
6573 Jps 
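
For anything longer than a quick test, the foreground process would normally be put in the background; a sketch using nohup (the log file name is arbitrary):

[zhouchen@hadoop102 maxwell-1.27.0]$ nohup bin/maxwell --config config.properties >> maxwell.log 2>&1 &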

3 MAXWELL BASIC OPERATIONS

3.1 TEST THAT MAXWELL RECEIVES BINLOG NORMALLY

1. Start Maxwell
See section 2.3.3 for details.
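
The inserts below use a test.test_maxwell table. Its DDL is not given in the original document, so the sketch below is inferred from the column names in the JSON output and may differ from the real definition:

mysql> create database if not exists test;
mysql> use test;
mysql> create table if not exists test_maxwell(age int, tall int, name varchar(32), record int);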

2. Insert data into MySQL

mysql> insert into test_maxwell values(4,6,'jack',4);
Query OK, 1 row affected (0.00 sec)
mysql> insert into test_maxwell values(3,5,'jinb',5);
Query OK, 1 row affected (0.01 sec)

3. Check the log printed in real time by the Maxwell foreground process (the binlog events are printed as JSON)

[zhouchen@hadoop102 maxwell-1.27.0]$ bin/maxwell --user='maxwell' --password='maxwell' --host='hadoop102' --producer=stdout
Using kafka version: 1.0.0
15:20:33,541 WARN  MaxwellMetrics - Metrics will not be exposed: metricsReportingType not configured.
15:20:33,831 INFO  SchemaStoreSchema - Creating maxwell database
15:20:33,912 INFO  Maxwell - Maxwell v1.27.0 is booting (StdoutProducer), starting at Position[BinlogPosition[mysql-bin.000070:4928], lastHeartbeat=0]
15:20:33,998 INFO  AbstractSchemaStore - Maxwell is capturing initial schema
15:20:34,209 INFO  BinlogConnectorReplicator - Setting initial binlog pos to: mysql-bin.000070:4928
15:20:34,220 INFO  BinaryLogClient - Connected to hadoop102:3306 at mysql-bin.000070/4928 (sid:6379, cid:25)
15:20:34,220 INFO  BinlogConnectorReplicator - Binlog connected.
{"database":"test","table":"test_maxwell","type":"insert","ts":1595402774,"xid":1207,"commit":true,"data":{"age":4,"tall":6,"name":"jack","record":4}}
{"database":"test","table":"test_maxwell","type":"insert","ts":1595402775,"xid":1210,"commit":true,"data":{"age":3,"tall":5,"name":"jinb","record":5}}

3.2 SYNCING DATA TO KAFKA IN REAL TIME WITH MAXWELL

3.2.1 Configure Kafka Settings

[zhouchen@hadoop102 maxwell-1.27.0]$ vim config.properties
producer=kafka
# Kafka cluster
kafka.bootstrap.servers=hadoop102:9092,hadoop103:9092,hadoop104:9092
# Kafka topic to write to; create it in advance
kafka_topic=maxwell
kafka.retries=1
kafka.acks=all
kinesis_stream=maxwell
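
Since the maxwell topic is created with 3 partitions in section 3.2.3, it can also be useful to control how events are spread over partitions. A hedged addition to config.properties using Maxwell's producer_partition_by option (check the documentation of your version):

# keep all changes to one table in the same partition, preserving per-table ordering
producer_partition_by=table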

3.2.2 Copy the kafka-client Jar into maxwell/lib

[zhouchen@hadoop102 kafka-clients]$ cp /opt/module/kafka/libs/kafka-clients-0.11.0.2.jar /opt/module/maxwell-1.27.0/lib/kafka-clients/

3.2.3 Start Kafka and Create the maxwell Topic

[zhouchen@hadoop102 kafka]$ bin/kafka-topics.sh --create --zookeeper hadoop102:2181 --replication-factor 1 --partitions 3 --topic maxwell
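
The topic can be verified after creation:

[zhouchen@hadoop102 kafka]$ bin/kafka-topics.sh --describe --zookeeper hadoop102:2181 --topic maxwell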

3.2.4 Start a Kafka Consumer on the maxwell Topic

[zhouchen@hadoop102 kafka]$ bin/kafka-console-consumer.sh --bootstrap-server hadoop102:9092 --topic maxwell --from-beginning

3.2.5 Start Maxwell

[zhouchen@hadoop102 maxwell-1.27.0]$ bin/maxwell --user='maxwell' --password='maxwell' --host='hadoop102'  --producer=kafka --kafka_version=0.11.0.2 --kafka.bootstrap.servers=hadoop102:9092,hadoop103:9092,hadoop104:9092 --kafka_topic=maxwell
Using kafka version: 0.11.0.2
16:01:39,871 WARN  MaxwellMetrics - Metrics will not be exposed: metricsReportingType not configured.
16:01:40,194 INFO  ProducerConfig - ProducerConfig values: 
	acks = 1
	batch.size = 16384
	bootstrap.servers = [hadoop102:9092, hadoop103:9092, hadoop104:9092]
	buffer.memory = 33554432
	client.id = 
	compression.type = snappy
	connections.max.idle.ms = 540000
	enable.idempotence = false
	interceptor.classes = null
	key.serializer = class org.apache.kafka.common.serialization.StringSerializer
	linger.ms = 0
	max.block.ms = 60000
	max.in.flight.requests.per.connection = 5
	max.request.size = 1048576
	metadata.max.age.ms = 300000
	metric.reporters = []
	metrics.num.samples = 2
	metrics.recording.level = INFO
	metrics.sample.window.ms = 30000
	partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
	receive.buffer.bytes = 32768
	reconnect.backoff.max.ms = 1000
	reconnect.backoff.ms = 50
	request.timeout.ms = 30000
	retries = 0
	retry.backoff.ms = 100
	sasl.jaas.config = null
	sasl.kerberos.kinit.cmd = /usr/bin/kinit
	sasl.kerberos.min.time.before.relogin = 60000
	sasl.kerberos.service.name = null
	sasl.kerberos.ticket.renew.jitter = 0.05
	sasl.kerberos.ticket.renew.window.factor = 0.8
	sasl.mechanism = GSSAPI
	security.protocol = PLAINTEXT
	send.buffer.bytes = 131072
	ssl.cipher.suites = null
	ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
	ssl.endpoint.identification.algorithm = null
	ssl.key.password = null
	ssl.keymanager.algorithm = SunX509
	ssl.keystore.location = null
	ssl.keystore.password = null
	ssl.keystore.type = JKS
	ssl.protocol = TLS
	ssl.provider = null
	ssl.secure.random.implementation = null
	ssl.trustmanager.algorithm = PKIX
	ssl.truststore.location = null
	ssl.truststore.password = null
	ssl.truststore.type = JKS
	transaction.timeout.ms = 60000
	transactional.id = null
	value.serializer = class org.apache.kafka.common.serialization.StringSerializer

16:01:40,224 INFO  AppInfoParser - Kafka version : 0.11.0.2
16:01:40,224 INFO  AppInfoParser - Kafka commitId : 73be1e1168f91ee2
16:01:40,239 INFO  Maxwell - Maxwell v1.27.0 is booting (MaxwellKafkaProducer), starting at Position[BinlogPosition[mysql-bin.000070:166226], lastHeartbeat=1595404689401]
16:01:40,370 INFO  MysqlSavedSchema - Restoring schema id 1 (last modified at Position[BinlogPosition[mysql-bin.000070:4928], lastHeartbeat=0])
16:01:40,427 INFO  BinlogConnectorReplicator - Setting initial binlog pos to: mysql-bin.000070:166226
16:01:40,439 INFO  BinaryLogClient - Connected to hadoop102:3306 at mysql-bin.000070/166226 (sid:6379, cid:41)
16:01:40,439 INFO  BinlogConnectorReplicator - Binlog connected

3.2.6 Insert Data into MySQL

mysql> insert into test_maxwell values(1,1,'jim',2);
mysql> insert into test_maxwell values(2,1,'grace',2);

mysql> insert into test_maxwell values(3,5,'jinb',5);
mysql> insert into test_maxwell values(4,6,'jack',4);

mysql> insert into test_maxwell values(18,198,'lily',18);
mysql> insert into test_maxwell values(19,187,'green',36);

3.2.7 Kafka Consumes the Inserted Data

Note:
1. At this point the Maxwell foreground process no longer prints the synced binlog events itself.
2. The messages consumed from the maxwell topic are already in JSON format.

[zhouchen@hadoop102 kafka]$ bin/kafka-console-consumer.sh --bootstrap-server hadoop102:9092 --topic maxwell --from-beginning
{"database":"test","table":"test_maxwell","type":"insert","ts":1595404919,"xid":6248,"commit":true,"data":{"age":1,"tall":1,"name":"jim","record":2}}
{"database":"test","table":"test_maxwell","type":"insert","ts":1595404919,"xid":6249,"commit":true,"data":{"age":2,"tall":1,"name":"grace","record":2}}
{"database":"test","table":"test_maxwell","type":"insert","ts":1595404919,"xid":6250,"commit":true,"data":{"age":3,"tall":5,"name":"jinb","record":5}}
{"database":"test","table":"test_maxwell","type":"insert","ts":1595404919,"xid":6251,"commit":true,"data":{"age":4,"tall":6,"name":"jack","record":4}}
{"database":"test","table":"test_maxwell","type":"insert","ts":1595404919,"xid":6252,"commit":true,"data":{"age":18,"tall":198,"name":"lily","record":18}}
{"database":"test","table":"test_maxwell","type":"insert","ts":1595404920,"xid":6256,"commit":true,"data":{"age":19,"tall":187,"name":"green","record":36}}

3.3 MAXWELL CHECKPOINT AND RESUME

Maxwell provides checkpoint/resume out of the box: it records the last binlog position it processed and picks up from there after a restart.
1. First, kill the Maxwell process

09:47:14,409 INFO  TaskManager - Stopping 3 tasks
09:47:14,409 INFO  TaskManager - Stopping: com.zendesk.maxwell.schema.PositionStoreThread@7c690249
09:47:14,409 INFO  TaskManager - Stopping: com.zendesk.maxwell.bootstrap.BootstrapController@72aee3b4
09:47:14,409 INFO  PositionStoreThread - Storing final position: Position[BinlogPosition[mysql-bin.000071:1407], lastHeartbeat=1595468834305]
09:47:14,409 INFO  TaskManager - Stopping: com.zendesk.maxwell.replication.BinlogConnectorReplicator@76916939
09:47:14,722 INFO  TaskManager - Stopped all tasks

2. Then, while Maxwell is down, modify some data in the database:

mysql> update test.test_maxwell set age = 20 where name = 'grace';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

3. Start the Maxwell process again; the changes made while it was stopped are still received

[zhouchen@hadoop102 maxwell-1.27.0]$ bin/maxwell --user='maxwell' --password='maxwell' --host='hadoop102' --producer=stdout
Using kafka version: 1.0.0
09:50:37,462 WARN  MaxwellMetrics - Metrics will not be exposed: metricsReportingType not configured.
09:50:37,940 INFO  Maxwell - Maxwell v1.27.0 is booting (StdoutProducer), starting at Position[BinlogPosition[mysql-bin.000071:1407], lastHeartbeat=1595468834305]
09:50:38,064 INFO  MysqlSavedSchema - Restoring schema id 1 (last modified at Position[BinlogPosition[mysql-bin.000070:4928], lastHeartbeat=0])
09:50:38,403 INFO  BinlogConnectorReplicator - Setting initial binlog pos to: mysql-bin.000071:1407
09:50:38,414 INFO  BinaryLogClient - Connected to hadoop102:3306 at mysql-bin.000071/1407 (sid:6379, cid:33)
09:50:38,414 INFO  BinlogConnectorReplicator - Binlog connected.
{"database":"test","table":"test_maxwell","type":"update","ts":1595468985,"xid":180,"commit":true,"data":{"age":20,"tall":1,"name":"grace","record":2},"old":{"age":2}}

3.4 MAXWELL FILTER CONFIGURATION

Maxwell specifies filtering rules through the --filter option: exclude entries filter data out and include entries add it back. A rule can target a specific database, table, or column, can use regular expressions, and complex rules can even be written in JavaScript. A few examples from the official documentation:
# Within the foodb database, only match the tbl table and tables named table_<digits>

--filter='exclude: foodb.*, include: foodb.tbl, include: foodb./table_\d+/'

# Exclude all tables in all databases, then include only the db1 database

--filter = 'exclude: *.*, include: db1.*'

# Exclude any update where column db.tbl.col has the value 'reject'

--filter = 'exclude: db.tbl.col = reject'

# Exclude any update that contains a col_a column (with any value)

--filter = 'exclude: *.*.col_a = *'

# blacklist: completely exclude the bad_db database; to undo this, the maxwell database must be dropped

--filter = 'blacklist: bad_db.*' 
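
The same rules can also be placed in config.properties via the filter key instead of on the command line (a sketch equivalent to the second example above):

# only replicate the db1 database
filter=exclude: *.*, include: db1.*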