Chapter 3: Getting Started with Maxwell

3.1 Monitoring MySQL Data and Printing It to the Console

1) Steps:

(1) Run Maxwell to monitor MySQL data changes

[root@hdp101 maxwell]# bin/maxwell --user='maxwell' \
--password='maxwell' \
--host='hdp103' \
--producer=stdout

(2) Insert a row into the test table of the test_maxwell database in MySQL and check Maxwell's console output


mysql> insert into test values(1,'aaa');

{
	"database": "test_maxwell", -- database name
	"table": "test", -- table name
	"type": "insert", -- change type
	"ts": 1653568451, -- event timestamp (epoch seconds)
	"xid": 1109, -- transaction id
	"commit": true, -- transaction committed
	"data": { -- row data
		"id": 1,
		"name": "aaa"
	}
}
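The fields above can be consumed programmatically. A minimal Python sketch that parses one Maxwell event (the event string is copied from the console output above):

```python
import json

# One insert event as emitted by Maxwell (copied from the console output above).
event_json = '''{
    "database": "test_maxwell",
    "table": "test",
    "type": "insert",
    "ts": 1653568451,
    "xid": 1109,
    "commit": true,
    "data": {"id": 1, "name": "aaa"}
}'''

event = json.loads(event_json)

# Route on the change type; an insert event carries the full new row in "data".
if event["type"] == "insert":
    row = event["data"]
    print(f'insert into {event["database"]}.{event["table"]}: {row}')
```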

(3) Insert 3 rows into the test table of the test_maxwell database in a single statement; the console prints 3 JSON events, which shows that Maxwell captures changes at the row level. Note that the three events share one xid and only the last one carries "commit": true.

mysql> INSERT INTO test VALUES(2,'bbb'),(3,'ccc'),(4,'ddd');

{"database":"test_maxwell","table":"test","type":"insert","ts":1653568679,"xid":1637,"xoffset":0,"data":{"id":2,"name":"bbb"}}
{"database":"test_maxwell","table":"test","type":"insert","ts":1653568679,"xid":1637,"xoffset":1,"data":{"id":3,"name":"ccc"}}
{"database":"test_maxwell","table":"test","type":"insert","ts":1653568679,"xid":1637,"commit":true,"data":{"id":4,"name":"ddd"}}
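Events from one transaction share an xid, are ordered by xoffset, and only the final event carries commit: true. A sketch that groups such a stream into complete transactions (the three lines are copied from the output above):

```python
import json

# The three events printed above: same xid, increasing xoffset, commit on the last.
lines = [
    '{"database":"test_maxwell","table":"test","type":"insert","ts":1653568679,"xid":1637,"xoffset":0,"data":{"id":2,"name":"bbb"}}',
    '{"database":"test_maxwell","table":"test","type":"insert","ts":1653568679,"xid":1637,"xoffset":1,"data":{"id":3,"name":"ccc"}}',
    '{"database":"test_maxwell","table":"test","type":"insert","ts":1653568679,"xid":1637,"commit":true,"data":{"id":4,"name":"ddd"}}',
]

transactions = {}  # xid -> list of row dicts
completed = []     # xids whose commit event has arrived

for line in lines:
    event = json.loads(line)
    transactions.setdefault(event["xid"], []).append(event["data"])
    if event.get("commit"):  # only the last event of a transaction has this flag
        completed.append(event["xid"])
```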

Updating a row produces an update event that carries both the new values and the previous values of the changed columns:

mysql> update test set name='zaijian' where id =1;

{
	"database": "test_maxwell",
	"table": "test",
	"type": "update",
	"ts": 1653568713,
	"xid": 1732,
	"commit": true,
	"data": {
		"id": 1,
		"name": "zaijian"
	},
	"old": {
		"name": "aaa"
	}
}
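Since the old object lists only the columns whose values changed, the full pre-update row can be recovered by overlaying old onto data. A sketch using the update event above:

```python
import json

# The update event printed above: "data" is the row after the update,
# "old" holds the previous values of only the changed columns.
event = json.loads('''{
    "database": "test_maxwell", "table": "test", "type": "update",
    "ts": 1653568713, "xid": 1732, "commit": true,
    "data": {"id": 1, "name": "zaijian"},
    "old": {"name": "aaa"}
}''')

after = event["data"]
before = {**after, **event["old"]}   # unchanged columns come from "data"
changed = sorted(event["old"])       # the columns that actually changed
```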

(4) Update a row in the test table of the test_maxwell database and check Maxwell's console output

mysql> update test set name='abc' where id =1;

{
	"database": "test_maxwell",
	"table": "test",
	"type": "update",
	"ts": 1653568766,
	"xid": 1858,
	"commit": true,
	"data": { -- row after the update
		"id": 1,
		"name": "abc"
	},
	"old": { -- row values before the update
		"name": "zaijian"
	}
}

(5) Delete a row from the test table of the test_maxwell database and check Maxwell's console output

mysql> DELETE FROM test WHERE id =1;

{
	"database": "test_maxwell",
	"table": "test",
	"type": "delete",
	"ts": 1653568862,
	"xid": 2078,
	"commit": true,
	"data": {
		"id": 1,
		"name": "abc"
	}
}

3.2 Monitoring MySQL Data and Sending It to Kafka

1. Steps:

(1) Start ZooKeeper and Kafka

[root@hdp101 maxwell]# jpsall 
--------------------- hdp101 node ---------------------
6662 Kafka
6007 QuorumPeerMain
--------------------- hdp102 node ---------------------
28597 Kafka
28189 QuorumPeerMain
--------------------- hdp103 node ---------------------
30555 QuorumPeerMain
30957 Kafka

(2) Start Maxwell to monitor the binlog

[root@hdp101 maxwell]# bin/maxwell --user='maxwell' \
--password='maxwell' \
--host='hdp103' \
--producer=kafka \
--kafka.bootstrap.servers=hdp101:9092 --kafka_topic=maxwell

(3) Open a Kafka console consumer to consume the maxwell topic

[root@hdp101 ~]# kafka-console-consumer.sh --bootstrap-server hdp101:9092 --topic maxwell

(4) Insert another row into the test table of the test_maxwell database

mysql> insert into test values (5,'eee');

(5) The Kafka consumer shows the data, confirming it was delivered to Kafka successfully

{"database":"test_maxwell","table":"test","type":"insert","ts":1653569183,"xid":2558,"commit":true,"data":{"id":5,"name":"eee"}}

2. Controlling the partitioning of Kafka topic data

In production environments, Maxwell usually monitors several MySQL databases at once and sends all of their data to a single Kafka topic. To increase parallelism, that topic will almost certainly have multiple partitions, so controlling which partition each event lands in becomes critical. The steps are as follows:

(1) Edit Maxwell's configuration file to customize how the Maxwell process starts

[root@hdp101 maxwell]# vim config.properties

# tl;dr config
log_level=info

producer=kafka
kafka.bootstrap.servers=hdp101:9092,hdp102:9092,hdp103:9092

# mysql login info
host=hdp103
user=maxwell
password=maxwell

# *** kafka ***
# list of kafka brokers
#kafka.bootstrap.servers=hosta:9092,hostb:9092
# kafka topic to write to
# this can be static, e.g. 'maxwell', or dynamic, e.g.
# namespace_%{database}_%{table}
# in the latter case 'database' and 'table' will be replaced with the values for the row being processed
kafka_topic=maxwell3

# *** partitioning ***
# What part of the data do we partition by?
# Options: database, table, primary_key, transaction_id, column
producer_partition_by=database

# when partitioning by column, also list the columns to use (comma-separated)
#producer_partition_by=column
#producer_partition_columns=name

# when using producer_partition_by=column, partition by this when
# the specified column(s) don't exist.
#producer_partition_by_fallback=database

(2) Manually create a topic with 3 partitions, named maxwell3

[root@hdp101 maxwell]# kafka-topics.sh --zookeeper hdp101:2181,hdp102:2181,hdp103:2181/kafka \
--create --replication-factor 2 \
--partitions 3 --topic maxwell3

(3) Start the Maxwell process using the configuration file

[root@hdp101 maxwell]# bin/maxwell --config ./config.properties

(4) Insert another row into the test table of the test_maxwell database

mysql> insert into test_maxwell.test values (6,'fff');

(5) Inspecting with a Kafka GUI tool shows that this row landed in partition 1 of the maxwell3 topic


(6) Insert a row into a table in the test database

mysql> insert into test values (23,'dd');

(7) The Kafka tool shows that this row landed in partition 0 of the maxwell3 topic, which confirms that the database name determines which partition the data goes to.
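The effect can be pictured with a toy partitioner. This is only an illustration of keying by database name, not Kafka's actual murmur2 partitioner, so the partition numbers it computes will not match the real ones:

```python
import zlib

NUM_PARTITIONS = 3  # the maxwell3 topic created above

def pick_partition(partition_key: str, num_partitions: int) -> int:
    # Toy stand-in for a hash partitioner: equal keys always map to the
    # same partition. (Kafka's default producer uses murmur2; crc32 is
    # used here purely for illustration.)
    return zlib.crc32(partition_key.encode("utf-8")) % num_partitions

# With producer_partition_by=database, every row from one database
# hashes to the same partition, preserving per-database ordering.
p1 = pick_partition("test_maxwell", NUM_PARTITIONS)
p2 = pick_partition("test_maxwell", NUM_PARTITIONS)
p3 = pick_partition("test", NUM_PARTITIONS)
```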

3.3 Monitoring a Specific MySQL Table and Printing to the Console

(1) Run Maxwell to monitor changes to a specific MySQL table

[root@hdp101 ~]# bin/maxwell --user='maxwell' \
--password='maxwell' \
--host='hdp103' \
--filter 'exclude: *.*, include:test_maxwell.test' \
--producer=stdout

(2) Insert a row into the test_maxwell.test table and watch Maxwell's output

mysql> insert into test_maxwell.test values(7,'ggg');

{"database":"test_maxwell","table":"test","type":"insert","ts":1653570323,"xid":5034,"commit":true,"data":{"id":7,"name":"ggg"}}

(3) Insert a row into a different table in the test_maxwell database and watch Maxwell's output

mysql> insert into test1 values(1,'nihao');

No events are received this time.
This shows the include parameter is in effect: Maxwell only reports changes to the specified MySQL table.

Note: you can also set include:test_maxwell.* to monitor every table in a MySQL database, i.e., filter at the database level. Readers can test this themselves.
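The way exclude and include patterns combine can be sketched as follows. This is an illustration of the filtering concept, not Maxwell's actual filter parser:

```python
from fnmatch import fnmatch

# --filter 'exclude: *.*, include: test_maxwell.test' expressed as rule pairs.
RULES = [("exclude", "*.*"), ("include", "test_maxwell.test")]

def passes(database: str, table: str) -> bool:
    """Later rules override earlier ones, so a specific include can
    carve an exception out of a blanket exclude."""
    allowed = True
    for action, pattern in RULES:
        if fnmatch(f"{database}.{table}", pattern):
            allowed = (action == "include")
    return allowed
```

With these rules, only changes to test_maxwell.test pass the filter; rows from any other table, including other tables in test_maxwell, are dropped.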

3.4 Full-Table Output of a MySQL Table to the Console (Data Initialization)

By default, a Maxwell process only captures new and changed data from MySQL's binlog. Maxwell does, however, support data initialization: by modifying Maxwell's metadata, you can trigger a full dump of a MySQL table, i.e., what is commonly called a full synchronization. The steps are as follows:

Requirement: dump all existing rows of the test2 table in the test_maxwell database to the Maxwell console.

(1) Modify Maxwell's metadata to trigger the initialization mechanism: insert a row into the bootstrap table of the maxwell database in MySQL, naming the database and table whose full data is required.

mysql> insert into maxwell.bootstrap(database_name,table_name)
values('test_maxwell','test2');

(2) Start the Maxwell process; the initialization routine prints all rows of the test2 table directly

[root@hdp101 ~]# bin/maxwell --user='maxwell' \
--password='maxwell' \
--host='hdp103' \
--producer=stdout

Using kafka version: 1.0.0
21:10:28,342 INFO  Maxwell - Starting Maxwell. maxMemory: 466288640 bufferMemoryUsage: 0.25
21:10:28,793 INFO  Maxwell - Maxwell v1.29.2 is booting (StdoutProducer), starting at Position[BinlogPosition[mysql-bin.000003:3730], lastHeartbeat=0]
21:10:29,599 INFO  MysqlSavedSchema - Restoring schema id 1 (last modified at Position[BinlogPosition[mysql-bin.000003:583], lastHeartbeat=0])
21:10:30,207 INFO  BinlogConnectorReplicator - Setting initial binlog pos to: mysql-bin.000003:3730
21:10:30,301 INFO  BinaryLogClient - Connected to hdp103:3306 at mysql-bin.000003/3730 (sid:6379, cid:103)
21:10:30,301 INFO  BinlogConnectorReplicator - Binlog connected.
21:10:30,999 INFO  AbstractSchemaStore - storing schema @Position[BinlogPosition[mysql-bin.000003:3826], lastHeartbeat=0] after applying "CREATE TABLE `test_copy1` (   `id` bigint(20) DEFAULT NULL,   `name` varchar(200) DEFAULT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8" to test_maxwell, new schema id is 2
{"database":"test_maxwell","table":"test_copy1","type":"insert","ts":1653570377,"xid":5172,"xoffset":0,"data":{"id":2,"name":"bbb"}}
{"database":"test_maxwell","table":"test_copy1","type":"insert","ts":1653570377,"xid":5172,"xoffset":1,"data":{"id":3,"name":"ccc"}}
{"database":"test_maxwell","table":"test_copy1","type":"insert","ts":1653570377,"xid":5172,"xoffset":2,"data":{"id":4,"name":"ddd"}}
{"database":"test_maxwell","table":"test_copy1","type":"insert","ts":1653570377,"xid":5172,"xoffset":3,"data":{"id":5,"name":"eee"}}
{"database":"test_maxwell","table":"test_copy1","type":"insert","ts":1653570377,"xid":5172,"xoffset":4,"data":{"id":6,"name":"fff"}}
{"database":"test_maxwell","table":"test_copy1","type":"insert","ts":1653570377,"xid":5172,"xoffset":5,"data":{"id":23,"name":"dd"}}
{"database":"test_maxwell","table":"test_copy1","type":"insert","ts":1653570377,"xid":5172,"commit":true,"data":{"id":7,"name":"ggg"}}
21:10:31,127 INFO  AbstractSchemaStore - storing schema @Position[BinlogPosition[mysql-bin.000003:4484], lastHeartbeat=0] after applying "RENAME TABLE `test_maxwell`.`test_copy1` TO `test_maxwell`.`test2`" to test_maxwell, new schema id is 3
21:10:31,161 INFO  AbstractSchemaStore - storing schema @Position[BinlogPosition[mysql-bin.000003:4705], lastHeartbeat=0] after applying "RENAME TABLE `test_maxwell`.`test2` TO `test_maxwell`.`test1`" to test_maxwell, new schema id is 4
{"database":"test_maxwell","table":"test1","type":"insert","ts":1653570412,"xid":5279,"commit":true,"data":{"id":1,"name":"nihao"}}

(3) Once the data has been fully initialized, Maxwell's metadata changes:

The is_complete field changes from 0 to 1.

The started_at field changes from null to a concrete timestamp (when the sync started).

The completed_at field changes from null to a concrete timestamp (when the sync finished).
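The lifecycle of a bootstrap request can be sketched as a small state transition. Field names follow Maxwell's maxwell.bootstrap table; the timestamps here are illustrative stand-ins for what MySQL would record:

```python
import time

# A bootstrap request as inserted in step (1): pending, no timestamps yet.
bootstrap_row = {
    "database_name": "test_maxwell",
    "table_name": "test2",
    "is_complete": 0,
    "started_at": None,
    "completed_at": None,
}

# When Maxwell picks the row up, it stamps started_at...
bootstrap_row["started_at"] = time.strftime("%Y-%m-%d %H:%M:%S")

# ...and once every row of the table has been emitted, it marks the job done.
bootstrap_row["is_complete"] = 1
bootstrap_row["completed_at"] = time.strftime("%Y-%m-%d %H:%M:%S")
```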
