kudu导出为mysql格式_利用StreamSets实现MySQL中变化数据实时写入Kudu

环境准备

开启MariaDB的Binlog日志

修改/etc/my.conf文件,在配置文件[mysqld]下增加如下配置

server-id=999

log-bin=mysql-bin

binlog_format=ROW

注意:

MySQL Binlog支持多种数据更新格式包括Row、Statement和mix(Row和Statement的混合),这里建议使用Row模式的Binlog格式,可以更加方便实时的反应行级别的数据变化。

[root@node01 mariadb]# systemctl restart mysqld

[root@node01 mariadb]# systemctl status mysqld

● mysqld.service - LSB: start and stop MariaDB

Loaded: loaded (/etc/rc.d/init.d/mysqld; bad; vendor preset: disabled)

Active: active (running) since Tue 2020-04-28 09:59:03 CST; 1min 11s ago

Docs: man:systemd-sysv-generator(8)

Process: 12771 ExecStop=/etc/rc.d/init.d/mysqld stop (code=exited, status=0/SUCCESS)

Process: 12925 ExecStart=/etc/rc.d/init.d/mysqld start (code=exited, status=0/SUCCESS)

CGroup: /system.slice/mysqld.service

├─12970 /bin/sh /usr/local/mariadb/bin/mysqld_safe --datadir=/usr/local/mariadb/data --pid-file=/usr/local/mariadb/data/node01.pid

└─13079 /usr/local/mariadb/bin/mysqld --basedir=/usr/local/mariadb --datadir=/usr/local/mariadb/data --plugin-dir=/usr/local/mariadb/lib/plugin --user=mysql --log-error=/usr/l...

Apr 28 09:59:02 node01 systemd[1]: Starting LSB: start and stop MariaDB...

Apr 28 09:59:02 node01 mysqld[12925]: Starting MariaDB.200428 09:59:02 mysqld_safe Logging to '/usr/local/mariadb/data/node01.err'.

Apr 28 09:59:02 node01 mysqld[12925]: 200428 09:59:02 mysqld_safe Starting mysqld daemon with databases from /usr/local/mariadb/data

Apr 28 09:59:03 node01 mysqld[12925]: [ OK ]

Apr 28 09:59:03 node01 systemd[1]: Started LSB: start and stop MariaDB.

创建 MariaDB 同步账号

GRANT ALL on maxwell.* to 'maxwell'@'%' identified by '123456';

GRANT SELECT, REPLICATION CLIENT, REPLICATION SLAVE on *.* to 'maxwell'@'%';

FLUSH PRIVILEGES;

创建测试表

在MariaDB数据库中创建测试表

CREATE DATABASE IF NOT EXISTS test;

CREATE TABLE test.`binlog_test` (

`id` int(11) NOT NULL AUTO_INCREMENT,

`name` varchar(255) NOT NULL,

PRIMARY KEY (`id`) USING BTREE

) ENGINE=InnoDB DEFAULT CHARSET=utf8;

在 HUE 上创建 KUDU 表

CREATE DATABASE IF NOT EXISTS test;

CREATE TABLE IF NOT EXISTS test.binlog_test (

id int,

name String,

PRIMARY key(id)

)

PARTITION BY HASH PARTITIONS 16

STORED AS KUDU;

创建 StreamSets 的 Pipline

创建一个新的 Pipline

ad291fb0b414

image

选择 Origins 类别,搜索 MySQL Binary Log,并拖动到画布

ad291fb0b414

image

配置 MySQL Binary Log 基本信息

ad291fb0b414

image

**配置 MySQL 连接信息**

**注意:** 此处配置的 Server ID 应与 MySQL 的 my.cnf 文件中的 server-id 保持一致。

![image](https://upload-images.jianshu.io/upload_images/18545623-f43db30aa5089c27?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

**配置 MySQL 账号信息**

![image](https://upload-images.jianshu.io/upload_images/18545623-33d3abeb44e632e3?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

**高级配置,根据自己的需要进行配置,这里采用默认**

![image](https://upload-images.jianshu.io/upload_images/18545623-59232a75d0526241?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

**参考:**

https://streamsets.com/documentation/datacollector/latest/help/datacollector/UserGuide/Origins/MySQLBinaryLog.html?hl=mysql%2Cbinary%2Clog

添加表过滤的Stream Selector 1

ad291fb0b414

image

配置 Stream Selector 基本信息

ad291fb0b414

image

配置分流条件

${record:value("/Table") == "binlog_test"}

ad291fb0b414

image

添加处理日志 JavaScript Evaluator

添加解析 DELETE 类型的Binary Log 日志的 JavaScript Evaluator

ad291fb0b414

image

配置JavaScript脚本

for(var i = 0; i < records.length; i++) {

try {

var newRecord = sdcFunctions.createRecord(true);

newRecord.value = records[i].value['Data'];

newRecord.value.Type = records[i].value['Type'];

newRecord.value.Database = records[i].value['Database'];

newRecord.value.Table = records[i].value['Table'];

log.info(records[i].value['Type'])

output.write(newRecord);

} catch (e) {

// Send record to error

error.write(records[i], e);

}

}

ad291fb0b414

image

**添加解析 INSRET 和 UPDATE 类型日志的 JavaScript Evaluator**

![image](https://upload-images.jianshu.io/upload_images/18545623-30378a88b8f7f613?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

**配置JavaScript脚本**

```

for(var i = 0; i < records.length; i++) {

try {

var newRecord = sdcFunctions.createRecord(true);

newRecord.value = records[i].value['OldData'];

newRecord.value.Type = records[i].value['Type'];

newRecord.value.Database = records[i].value['Database'];

newRecord.value.Table = records[i].value['Table'];

log.info(records[i].value['Type'])

output.write(newRecord);

} catch (e) {

// Send record to error

error.write(records[i], e);

}

}

```

![image](https://upload-images.jianshu.io/upload_images/18545623-8492b6202b22945e?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

**参考:**

https://streamsets.com/documentation/datacollector/latest/help/datacollector/UserGuide/Processors/JavaScript.html?hl=javascript%2Cevaluator

添加 KUDU

配置 Kudu Delete

配置Kudu基本属性

ad291fb0b414

image

配置Kudu环境信息

ad291fb0b414

image

Kudu的高级配置,这里使用默认配置

ad291fb0b414

image

配置 Kudu Upsert

配置Kudu基本属性

ad291fb0b414

image

配置Kudu环境信息

ad291fb0b414

image

Kudu的高级配置,这里使用默认配置

ad291fb0b414

image

校验 Pipelines 配置

ad291fb0b414

image

启动 Pipelines

ad291fb0b414

image

Pipeline 流程测试

新增数据

向 MariaDB 中的 binlog_test 表中插入数据

insert into test.binlog_test values(1, '张三');

在 StreamSets 中查看的 Pipeline 状态

ad291fb0b414

image

ad291fb0b414

image

使用Hue查看 Kudu 表数据,验证数据是否成功插入

ad291fb0b414

image

更新数据

更新 MariaDB 中的 binlog_test 表中数据

update test.binlog_test set name='李四' where id = 1;

在 StreamSets 中查看的 Pipeline 状态

ad291fb0b414

image

ad291fb0b414

image

使用Hue查看 Kudu 表数据,验证数据是否成功更新

ad291fb0b414

image

删除数据

删除 MariaDB 中的 binlog_test 表中数据

delete from test.binlog_test where id = 1;

在 StreamSets 中查看的 Pipeline 状态

ad291fb0b414

image

ad291fb0b414

image

使用Hue查看 Kudu 表数据,验证数据是否成功删除

ad291fb0b414

image

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值