Configuration steps for canal, a real-time incremental MySQL-to-es sync tool

Copyright notice: This is an original article by the blogger, released under the CC 4.0 BY-SA license; please include the original source link and this notice when reposting.
Original link: https://blog.csdn.net/huxuanlai/article/details/86614401

#1. On your own test machine, build release packages from the modified canal-1.1.2 code. The source modifications and the versions of each system component can be seen in the second commit; the jdk is 1.8.
git clone git@github.com:huxuanlai/canal-1.1.2-mysql-es.git
mvn clean install -Denv=release
Find the deployer and adapter tar.gz packages in the canal-1.1.2-mysql-es/target directory.

#2. Simulated test environment setup
#2.1 On the mysql host, create a canal account; the canal server uses it to act as a fake mysql slave and fetch binlog event data
ssh xxx-xxx199-xxx
mysql -u root -proot

CREATE USER canal IDENTIFIED BY 'canal';
GRANT SELECT, SHOW VIEW, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'canal'@'%';
-- GRANT ALL PRIVILEGES ON *.* TO 'canal'@'%'; -- the SHOW VIEW privilege is required
FLUSH PRIVILEGES;
quit;

vim /etc/my.cnf
binlog_format=row
/etc/init.d/xxx-mysqld restart
Verify the change by running the following in mysql; the value should be row, not mixed etc.
mysql -u root -proot
show variables like 'binlog_format';
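Note that binlog_format=row alone is not enough: canal can only pose as a slave if the binary log is actually enabled and the server has an id. A minimal [mysqld] fragment (the log file name and server-id below are example values, not taken from this setup):

```ini
[mysqld]
log-bin=mysql-bin   # binlog must be enabled for canal to receive events
binlog_format=row   # row-level events, as verified above
server-id=1         # must differ from the slaveId the canal instance uses
```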

#2.2 On one of the customer's servers, create a canal user as root
ssh xxx-xxx197-xxx
useradd -m canal
passwd canal
su canal
# Copy the deployer (the canal server) and adapter (the canal client es adapter) tar.gz files built in step 1 to this directory
mkdir canal.deployer-1.1.2
mkdir canal.adapter-1.1.2
tar -xzvf canal.deployer-1.1.2.tar.gz -C canal.deployer-1.1.2
tar -xzvf canal.adapter-1.1.2.tar.gz -C canal.adapter-1.1.2

Configure the canal server and client
cd ~/canal.deployer-1.1.2/conf
# Visit the arima server at http://x.x.x.194:8080 to look up the zk and kafka server addresses and ports configured in arima
vim canal.properties
Modify the following parameters:
canal.zkServers = xxx-xxx194-xxx:2181,xxx-xxx195-xxx:2181,xxx-xxx196-xxx:2181
canal.serverMode = kafka
canal.mq.servers = xxx-xxx195-xxx:6667,xxx-xxx196-xxx:6667,xxx-xxx197-xxx:6667

vim ~/canal.deployer-1.1.2/conf/example/instance.properties
Modify the following parameters:
canal.instance.master.address=xxx-xxx197-xxx:3306
canal.instance.dbUsername=canal
canal.instance.dbPassword=canal
canal.instance.connectionCharset = UTF-8
canal.instance.defaultDatabaseName =test
canal.instance.filter.regex=test\\..*
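In kafka serverMode, the same instance.properties also decides which topic the instance writes to. Per the canal 1.1.2 MQ settings, the relevant keys are as follows; the topic name example is what the consumer test in step 4.2 reads (partitioning keys shown commented out as optional):

```properties
canal.mq.topic=example
# canal.mq.partition=0                # write everything to one partition, or:
# canal.mq.partitionsNum=3            # hash rows across several partitions
# canal.mq.partitionHash=test.t6:id
```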

vim ~/canal.adapter-1.1.2/conf/application.yml
Modify the following parameters:
mode: kafka
  srcDataSources:
    defaultDS:
      url: jdbc:mysql://xxx-xxx199-xxx:3306/test?useUnicode=true
      username: root
      password: root
hosts: xxx-xxx195-xxx:9300,xxx-xxx196-xxx:9300,xxx-xxx198-xxx:9300
cluster.name: wxjtest (the es cluster name is shown at http://x.x.x.198:9200/)
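For orientation, the parameters above sit inside the following overall application.yml structure in canal 1.1.2 (a sketch of the shape using the values from this setup; treat it as illustrative rather than a verbatim drop-in file):

```yaml
canal.conf:
  mode: kafka
  mqServers: xxx-xxx195-xxx:6667,xxx-xxx196-xxx:6667,xxx-xxx197-xxx:6667
  srcDataSources:
    defaultDS:
      url: jdbc:mysql://xxx-xxx199-xxx:3306/test?useUnicode=true
      username: root
      password: root
  canalAdapters:
    - instance: example            # kafka topic / canal instance name
      groups:
        - groupId: g1
          outerAdapters:
            - name: es
              hosts: xxx-xxx195-xxx:9300,xxx-xxx196-xxx:9300,xxx-xxx198-xxx:9300
              properties:
                cluster.name: wxjtest
```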

#2.3 mysql, kafka, zk and jdk version compatibility
Running canal, which is built for jdk 1.8, on a system whose default jdk is 1.7:
because canal is compiled under jdk 8 by default, it must also run under jdk 8, so download and configure a dedicated jdk 8 environment just for canal.
下载jdk8: https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
Extract the archive into the /home/canal directory
vim ~/canal.adapter-1.1.2/bin/startup.sh
Before the line ALIBABA_JAVA="/usr/alibaba/java/bin/java", add:
JAVA="/home/canal/jdk1.8.0_191/bin/java"

#3. In arima, confirm that mysql, es, kafka and zk have started correctly, then start the canal server and client
ssh xxx-xxx197-xxx
su canal
cd ~/canal.deployer-1.1.2
bin/startup.sh

cd ~/canal.adapter-1.1.2
bin/startup.sh

#4. System correctness tests
#4.1 All components are up
In the arima console, the kafka and zk services show green status;
on the mysql host, use ps to check that the mysql process is running;
on the servers hosting the canal server (deployer) and canal client (adapter), use jps to check that their processes exist.

#4.2 mysql binlog events flow through the canal server into the kafka topic correctly
Kafka topic test: watch the topic for row-format insert/update/delete binlog events
ssh xxx-xxx197-xxx
cd /usr/hdp/current/kafka-broker
bin/kafka-console-consumer.sh --bootstrap-server xxx-xxx195-xxx:6667,xxx-xxx196-xxx:6667,xxx-xxx197-xxx:6667 --topic example
# note: --bootstrap-server and --zookeeper are mutually exclusive; with the old consumer, use
# --zookeeper xxx-xxx194-xxx:2181,xxx-xxx195-xxx:2181,xxx-xxx196-xxx:2181/kafka instead of --bootstrap-server
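In kafka mode, canal 1.1.2 publishes row changes to the topic as flat JSON messages, so the consumer output can also be checked programmatically. A small Python sketch; the sample message below is hand-written to follow the FlatMessage field layout (the ts/es/id values are made up, not real output):

```python
import json

# Hypothetical message as it would appear in the console consumer after
# an insert into test.t6 (field layout per canal 1.1.2 FlatMessage;
# the numeric values are illustrative).
raw = '''
{"data": [{"id": "1", "c1": "你好"}], "database": "test", "table": "t6",
 "type": "INSERT", "isDdl": false, "pkNames": ["id"], "old": null,
 "es": 1548000000000, "ts": 1548000000123, "id": 1}
'''

msg = json.loads(raw)
# type is INSERT/UPDATE/DELETE; data holds the affected rows (values as strings)
print(msg["type"], msg["database"] + "." + msg["table"], msg["data"])
```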

#4.3 Table filtering on the kafka topic works (i.e. canal.instance.filter.regex=test\\..* takes effect) and insert/update/delete data matches expectations
ssh xxx-xxx199-xxx
mysql -u root -proot
use test;
create table t6(
id int,
c1 text,
primary key(id)
);
insert into t6 values(1, "你好");
insert into t6 values(2, "你好");
insert into t6 values(3, "你好");
# The es index and field mapping must be created in advance
curl -X PUT "http://x.x.x.196:9200/test_t6" -H 'Content-Type: application/json' -d'
{
"settings" : {"number_of_shards" : 1},
"mappings": {"_doc": {"properties": {"id": {"type": "text"},"c1": {"type": "text"}}}}
}'
# Edit the yml file in some directory other than conf/es: the system watches every change under conf/es, so editing in place triggers a reload on every save and often makes the monitor crash with errors.
cp conf/es/mytest_user.yml  mytest_t6.yml
vim mytest_t6.yml
For example:
dataSourceKey: defaultDS
destination: example # if you change destination to xxx, you must create an xxx directory with a matching instance.properties under canal.adapter-1.1.2/conf/
esMapping:
  _index: test_t6
  _type: _doc
  _id: id
#  pk: id
  sql: "select id, c1 from t6"
#  objFields:
#    _labels: array:;
  etlCondition: ""
  commitBatch: 3000

mv mytest_t6.yml conf/es/
curl -XPOST http://127.0.0.1:8081/etl/es/mytest_t6.yml
insert into t6 values(4, "你好");
curl -XGET 'http://x.x.x.195:9200/_cat/indices/*?v&s=index&pretty=true'
CRUD tests:
insert into t6 values(5, '你好');
delete from t6 where id = 5;
update t6 set c1='我爱《三国演义》' where id = 4;
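The adapter turns each such message into es operations according to the esMapping above: the column named by _id becomes the document id, and the columns selected by the mapping's sql become the document body. A rough, purely illustrative Python sketch of that per-row transformation (the real adapter executes the sql against mysql; this toy version just picks the two mapped columns):

```python
def to_es_actions(msg, id_col="id"):
    """Map one canal flat message to (action, _id, doc) tuples for index
    test_t6, mimicking a simple single-table esMapping."""
    actions = []
    for row in msg["data"]:
        if msg["type"] == "DELETE":
            actions.append(("delete", row[id_col], None))
        else:  # INSERT or UPDATE -> (re)index the selected columns
            actions.append(("index", row[id_col], {"id": row["id"], "c1": row["c1"]}))
    return actions

# delete from t6 where id = 5  ->  one delete action for document _id 5
print(to_es_actions({"data": [{"id": "5", "c1": "你好"}], "type": "DELETE"}))
# [('delete', '5', None)]
```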

#4.4 The canal client consumes the canal server's kafka topic and writes its contents into the corresponding es index via the es mapping
List all indices; after inserting/deleting a row in mysql, the following es api shows the document count go up/down by 1:
curl -XGET 'http://x.x.x.195:9200/_cat/indices/*?v&s=index&pretty=true'

#4.5 Data in the es index is correct and queries work
List all documents in the index:
curl -XGET "x.x.x.196:9200/test_t6/_search?pretty=true"
Every insert/update/delete of a row in the t6 table of the mysql test database shows up in the es index; for example, after inserting the row with id 5, the test_t6 document with _id 5 can be retrieved via the es api:
curl -XGET "x.x.x.196:9200/test_t6/_doc/5?pretty=true"
