Streaming MySQL data into Kafka in real time with Flume and flume-ng-sql-source
1. Stream MySQL data into Kafka in real time via flume-ng-sql-source
This requires the flume-ng-sql-source plugin.
2. Copy the downloaded jar into Flume's lib directory.
Note: different plugin versions are only compatible with certain Flume versions (personally verified: flume-ng-sql-source-1.5.1.jar works with Flume 1.8.0).
3. Create mysql_conf.properties under ${FLUME_HOME}/conf.
The configuration below is from my setup; adjust it for your environment.
a1.sources = sql-source
a1.channels = ch1
a1.sinks = kafka
a1.sources.sql-source.channels=ch1
a1.sources.sql-source.type = org.keedio.flume.source.SQLSource
# Hibernate database connection properties
a1.sources.sql-source.hibernate.connection.url = jdbc:mysql://192.168.1.185:3306/dataassets
a1.sources.sql-source.hibernate.connection.user = root
a1.sources.sql-source.hibernate.connection.password = zhbr@2020
a1.sources.sql-source.hibernate.connection.autocommit = true
a1.sources.sql-source.hibernate.dialect = org.hibernate.dialect.MySQL5Dialect
a1.sources.sql-source.hibernate.connection.driver_class = com.mysql.jdbc.Driver
a1.sources.sql-source.table = datadictionary
a1.sources.sql-source.columns.to.select = *
a1.sources.sql-source.incremental.column.name = id
a1.sources.sql-source.incremental.value = 0
a1.sources.sql-source.run.query.delay=5000
a1.sources.sql-source.status.file.path = /opt/standalone/apache-flume-1.8.0-bin/status
a1.sources.sql-source.status.file.name = sql-source.status
a1.channels.ch1.type = memory
a1.channels.ch1.capacity = 10000
a1.channels.ch1.transactionCapacity = 10000
a1.channels.ch1.byteCapacityBufferPercentage = 20
a1.channels.ch1.byteCapacity = 800000
a1.sinks.kafka.channel = ch1
a1.sinks.kafka.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.kafka.topic = kafkatopic
a1.sinks.kafka.brokerList = 192.168.1.183:9092
a1.sinks.kafka.requiredAcks = 1
a1.sinks.kafka.batchSize = 20
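The source configured above polls MySQL incrementally: each cycle it selects rows whose incremental column (here `id`) exceeds the last value recorded in the status file, starting from `incremental.value = 0` and waiting `run.query.delay` milliseconds between cycles. A minimal sketch of that polling logic, using an in-memory SQLite table as a stand-in for MySQL (table and column names follow the config above; the sample rows are made up):

```python
import sqlite3

# Stand-in for the MySQL table in the config (table = datadictionary,
# incremental.column.name = id). SQLite is used here only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datadictionary (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO datadictionary VALUES (?, ?)",
                 [(1, "alpha"), (2, "beta"), (3, "gamma")])

def poll(last_id):
    """One polling cycle: fetch only rows newer than the last seen id."""
    rows = conn.execute(
        "SELECT * FROM datadictionary WHERE id > ? ORDER BY id", (last_id,)
    ).fetchall()
    # SQLSource persists the new high-water mark in its status file;
    # here we simply return it to the caller.
    new_last = rows[-1][0] if rows else last_id
    return rows, new_last

rows, last_id = poll(0)          # first cycle starts from incremental.value = 0
print(rows)                      # [(1, 'alpha'), (2, 'beta'), (3, 'gamma')]
conn.execute("INSERT INTO datadictionary VALUES (4, 'delta')")
rows, last_id = poll(last_id)    # next cycle picks up only the new row
print(rows)                      # [(4, 'delta')]
```

This is also why deleting the status file makes the source re-read the table from the configured starting value.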
4. Add the MySQL JDBC driver jar to Flume's lib directory as well.
5. Create the Kafka topic:
kafka-topics.sh --create --partitions 1 --replication-factor 1 --topic kafkatopic --zookeeper cdh01:2181
6. Start the Flume agent:
./flume-ng agent -n a1 -c conf -f ../conf/mysql_conf.properties -Dflume.root.logger=INFO,console
7. Consume the topic to verify the data:
./kafka-console-consumer.sh --bootstrap-server cdh01:9092 --topic kafkatopic --from-beginning
At this point you should see data flowing continuously from Flume into Kafka.
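The lines printed by the console consumer are the row values joined into CSV-style records, which is how flume-ng-sql-source serializes each result row before handing it to the channel. A rough illustration of that shape using Python's csv module (the row values here are made up, not from the datadictionary table):

```python
import csv
import io

# Hypothetical result row; a real row comes from columns.to.select = *.
row = [4, "delta", "2020-01-01"]

# Join the row's values into one CSV line, approximating the record
# format the console consumer prints for each event.
buf = io.StringIO()
csv.writer(buf).writerow(str(v) for v in row)
line = buf.getvalue().strip()
print(line)  # 4,delta,2020-01-01
```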