canal配置MySQL数据同步到ES
参考:canal实时同步mysql数据到elasticsearch(部署,配置,测试)(一)_使用canal同步mysql数据到elasticsearch-CSDN博客
需要注意的是:
es7路径下的配置文件中,SQL必须要用表别名,如果没有别名,会提示空指针错误。
dataSourceKey: defaultDS # 源数据源的key, 对应上面配置的srcDataSources中的值
destination: example # canal的instance或者MQ的topic
groupId: g1 # 对应MQ模式下的groupId, 只会同步对应groupId的数据
esVersion: es7
esMapping:
_index: test_book # es 的索引名称
_id: _id # es 的_id, 如果不配置该项必须配置下面的pk项_id则会由es自动分配
sql: "SELECT
b.id AS _id,
b.title,
b.author,
b.isbn,
b.publisher_name as publisherName
FROM
test_book b" # sql映射
etlCondition: "where p.id>={}" #etl的条件参数
commitBatch: 5000 # 提交批大小
如何添加多个MySQL的数据库作为数据源呢?
在adapter中修改配置文件 vi /conf/application.yml,srcDataSources下面可以添加多个jdbc链接
srcDataSources:
test1:
url: jdbc:mysql://10.102.1.38:3306/lj_bsd?useSSL=false
username: root
password: ******
test2:
url: jdbc:mysql://10.102.1.38:3306/lj_esd?useSSL=false
username: root
password: ******
test2:
url: jdbc:mysql://10.102.1.38:3306/lj_lsd?useSSL=false
username: root
password: ******
相应地,es7目录下也可以添加多个数据同步配置文件,在文件中写不同的数据库就可以了。
dataSourceKey: test2 // 选择test2作为数据库
destination: example
groupId: g1
esMapping:
_index: full_text_airport_info
_id: ID
# upsert: true
# pk: id
sql: "SELECT
a.ID as ID,
a.AIRPORT_NAME as AIRPORT_NAME,
a.AIRPORT_LEVEL as AIRPORT_LEVEL,
a.PHONE as PHONE,
a.ADDRESS as ADDRESS,
a.BELONGED_ORGAN as BELONGED_ORGAN,
a.COORDINATE as COORDINATE,
a.OPERATOR_USER as OPERATOR_USER,
a.CREATE_TIME as CREATE_TIME,
a.UPDATE_TIME as UPDATE_TIME,
a.DATA_STATUS as DATA_STATUS,
a.MARK as MARK
FROM AIRPORT_INFO a ;"
# objFields:
# _labels: array:;
# etlCondition: "where a.c_time>={}"
commitBatch: 3000