I'm a complete beginner; this post is just a record of my own process, so treat it with caution. If anything is wrong or could be improved, please point it out.
References:
https://www.confluent.io/blog/simplest-useful-kafka-connect-data-pipeline-world-thereabouts-part-1/
Everything in that link worked for me except the final part, viewing the data in Kafka.
Viewing the data in Kafka follows the official documentation instead:
https://docs.confluent.io/current/connect/connect-jdbc/docs/source_connector.html
To find out which topic was created, I used the following link,
https://www.jianshu.com/p/70712aea94fc
specifically its bin/kafka-topics --zookeeper localhost:2181 --list command, which lists the new topic.
(I didn't follow that link fully because I didn't know how to configure the properties file.)
The concrete steps are as follows:
1. Start the Confluent stack: $ bin/confluent start
2. Start MySQL and log in: $ mysql -u root -p
3. Create a user and grant it all privileges (mainly so that a beginner like me knows exactly which username, password, and host the JSON file will refer to later; with a user created this way, how to configure the properties file probably becomes clear too, though I haven't tried it):
CREATE USER 'rmoff'@'localhost' IDENTIFIED BY 'pw';    -- the username is rmoff, the password is pw
GRANT ALL PRIVILEGES ON *.* TO 'rmoff'@'localhost';    -- grant all privileges
4. Create the database and table in MySQL:
mysql> create database demo;
mysql> use demo;
mysql> create table foobar (c1 int,
c2 varchar(255),
create_ts timestamp DEFAULT CURRENT_TIMESTAMP ,
update_ts timestamp DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP );
mysql> show tables;
+----------------+
| Tables_in_demo |
+----------------+
| foobar |
+----------------+
mysql> insert into foobar (c1,c2) values(1,'foo');
mysql> select * from foobar;
+------+------+---------------------+---------------------+
| c1 | c2 | create_ts | update_ts |
+------+------+---------------------+---------------------+
| 1 | foo | 2018-08-02 14:28:29 | 2018-08-02 14:28:29 |
+------+------+---------------------+---------------------+
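The update_ts column is what later lets the connector detect changed rows: MySQL bumps it automatically on every UPDATE thanks to ON UPDATE CURRENT_TIMESTAMP. As a rough illustration of that behavior (a sketch using Python's built-in sqlite3; SQLite has no ON UPDATE CURRENT_TIMESTAMP, so it is emulated with a trigger — none of this is part of the actual setup):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE foobar (
    c1 INT,
    c2 VARCHAR(255),
    create_ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    update_ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- SQLite lacks ON UPDATE CURRENT_TIMESTAMP; a trigger emulates it
CREATE TRIGGER bump_update_ts AFTER UPDATE ON foobar
BEGIN
    UPDATE foobar SET update_ts = STRFTIME('%Y-%m-%d %H:%M:%f', 'now')
    WHERE rowid = NEW.rowid;
END;
""")
conn.execute("INSERT INTO foobar (c1, c2) VALUES (1, 'foo')")
before = conn.execute("SELECT update_ts FROM foobar WHERE c1 = 1").fetchone()[0]
conn.execute("UPDATE foobar SET c2 = 'bar' WHERE c1 = 1")
after = conn.execute("SELECT update_ts FROM foobar WHERE c1 = 1").fetchone()[0]
# update_ts changed even though the UPDATE never touched it explicitly
print(before, "->", after)
```

The same automatic bump is what step 10 below relies on: updating c2 alone is enough for the connector to notice the row changed.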
5. Create the Kafka Connect JDBC source connector.
Write the following to /tmp/kafka-connect-jdbc-source.json (for convenience I put kafka-connect-jdbc-source.json directly in the confluent-5.0.0 directory instead):
{
    "name": "jdbc_source_mysql_foobar_01",
    "config": {
        "_comment": "The JDBC connector class. Don't change this if you want to use the JDBC Source.",
        "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
        "_comment": "How to serialise the value of keys - here use the Confluent Avro serialiser. Note that the JDBC Source Connector always returns null for the key ",
        "key.converter": "io.confluent.connect.avro.AvroConverter",
        "_comment": "Since we're using Avro serialisation, we need to specify the Confluent schema registry at which the created schema is to be stored. NB Schema Registry and Avro serialiser are both part of Confluent Open Source.",
        "key.converter.schema.registry.url": "http://localhost:8081",
        "_comment": "As above, but for the value of the message. Note that these key/value serialisation settings can be set globally for Connect and thus omitted for individual connector configs to make them shorter and clearer",
        "value.converter": "io.confluent.connect.avro.AvroConverter",
        "value.converter.schema.registry.url": "http://localhost:8081",
        "_comment": " --- JDBC-specific configuration below here --- ",
        "_comment": "JDBC connection URL. This will vary by RDBMS. Consult your manufacturer's handbook for more information",
        "connection.url": "jdbc:mysql://localhost:3306/demo?user=rmoff&password=pw",
        "_comment": "Which table(s) to include",
        "table.whitelist": "foobar",
        "_comment": "Pull all rows based on a timestamp column. You can also do bulk or incrementing column-based extracts. For more information, see http://docs.confluent.io/current/connect/connect-jdbc/docs/source_config_options.html#mode",
        "mode": "timestamp",
        "_comment": "Which column has the timestamp value to use? ",
        "timestamp.column.name": "update_ts",
        "_comment": "If the column is not defined as NOT NULL, tell the connector to ignore this ",
        "validate.non.null": "false",
        "_comment": "The Kafka topic will be made up of this prefix, plus the table name ",
        "topic.prefix": "mysql-"
    }
}
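One thing worth noting about this file (carried over from the first link): JSON does not really allow repeated keys, so most parsers silently keep only the last "_comment" value; the connector ignores unknown "_comment" entries either way, so the file still loads fine. A quick check of both points with Python's json module (the snippet below is a cut-down stand-in for the config above, not the full file):

```python
import json

# Two "_comment" keys, mirroring the style of the connector config above
snippet = '''{
    "_comment": "first comment",
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "_comment": "second comment",
    "topic.prefix": "mysql-"
}'''

parsed = json.loads(snippet)  # no error: the later duplicate simply wins
print(parsed["_comment"])     # -> second comment

# The Kafka topic name the connector produces is topic.prefix + table name
topic = parsed["topic.prefix"] + "foobar"
print(topic)                  # -> mysql-foobar
```

This also shows where the mysql-foobar topic seen in step 8 comes from: "topic.prefix" plus the whitelisted table name.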
6. Then load the connector:
$ bin/confluent load jdbc_source_mysql_foobar_01 -d /tmp/kafka-connect-jdbc-source.json
(I dropped the /tmp/ prefix, since my file sits in the confluent-5.0.0 directory.)
7. Check the status:
./bin/confluent status jdbc_source_mysql_foobar_01
You should get a result like this:
{
    "name": "jdbc_source_mysql_foobar_01",
    "connector": {
        "state": "RUNNING",
        "worker_id": "127.0.0.1:8083"
    },
    "tasks": [
        {
            "state": "RUNNING",
            "id": 0,
            "worker_id": "127.0.0.1:8083"
        }
    ]
}
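The confluent status command just queries Kafka Connect's REST interface, so this check can also be scripted. A minimal sketch that parses the status JSON above and confirms the connector and every task are RUNNING (the status text is pasted in as a literal here; in a real script you would fetch it from the Connect worker's REST endpoint instead):

```python
import json

# Sample status output, as shown in step 7
status_json = '''{
    "name": "jdbc_source_mysql_foobar_01",
    "connector": {"state": "RUNNING", "worker_id": "127.0.0.1:8083"},
    "tasks": [{"state": "RUNNING", "id": 0, "worker_id": "127.0.0.1:8083"}]
}'''

status = json.loads(status_json)
# Both the connector itself and every task must report RUNNING
healthy = (status["connector"]["state"] == "RUNNING"
           and all(t["state"] == "RUNNING" for t in status["tasks"]))
print("healthy" if healthy else "fix the connector before continuing")
```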
If something fails here, fix it before moving on.
8. Next, check whether the data has been copied over; this is where the first link stopped working for me.
Run:
$ bin/kafka-avro-console-consumer \
    --bootstrap-server localhost:9092 \
    --property schema.registry.url=http://localhost:8081 \
    --property print.key=true \
    --from-beginning \
    --topic mysql-foobar
For me this failed with:
Exception in thread "main" org.apache.kafka.common.config.ConfigException: Missing schema registry url!
Following the third link instead, verify that the corresponding topic was created:
$ bin/kafka-topics --zookeeper localhost:2181 --list
__confluent.support.metrics
__consumer_offsets
_confluent-ksql-default__command_topic
_schemas
connect-configs
connect-offsets
connect-statuses
mysql-foobar
The last one, mysql-foobar, looks familiar; it should be the newly created topic.
9. Verify the data in the topic:
$ bin/kafka-avro-console-consumer --bootstrap-server localhost:9092 --topic mysql-foobar --from-beginning
Output: {"c1":{"int":1},"c2":{"string":"foo"},"create_ts":{"long":1533220109000},"update_ts":{"long":1533220109000}}
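The nested objects in that output ({"int":1} instead of plain 1) are the Avro JSON encoding of union types such as ["null","int"], which the console consumer prints verbatim. A small sketch that unwraps one such record back to plain values (it assumes every field is either null or a single-branch union wrapper, which holds for this table):

```python
import json

# One line of kafka-avro-console-consumer output, as shown in step 9
line = ('{"c1":{"int":1},"c2":{"string":"foo"},'
        '"create_ts":{"long":1533220109000},"update_ts":{"long":1533220109000}}')

def unwrap(record):
    # {"int": 1} -> 1 ; nulls and plain values pass through unchanged
    return {k: (next(iter(v.values())) if isinstance(v, dict) and len(v) == 1 else v)
            for k, v in record.items()}

row = unwrap(json.loads(line))
print(row)  # {'c1': 1, 'c2': 'foo', 'create_ts': 1533220109000, 'update_ts': 1533220109000}
```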
10. Verify that incremental changes can be imported from MySQL.
Reconnect to MySQL:
$ mysql --user=rmoff --password=pw demo    (demo is the database; foobar is the table)
mysql> insert into foobar (c1,c2) values(2,'foo');
mysql> insert into foobar (c1,c2) values(3,'foo');
mysql> update foobar set c2='bar' where c1=1;
mysql> select * from foobar;
+------+------+---------------------+---------------------+
| c1 | c2 | create_ts | update_ts |
+------+------+---------------------+---------------------+
| 1 | bar | 2018-08-02 14:28:29 | 2018-08-02 15:18:43 |
| 2 | foo | 2018-08-02 15:18:22 | 2018-08-02 15:18:22 |
| 3 | foo | 2018-08-02 15:18:33 | 2018-08-02 15:18:33 |
+------+------+---------------------+---------------------+
On the Confluent side, the consumer then showed the updated MySQL rows on its own.
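This works because in timestamp mode the connector repeatedly asks for rows whose update_ts is greater than the largest value it has already seen, so both fresh inserts and updates to old rows come through. A rough simulation of that polling loop (Python + sqlite3, with timestamps written in explicitly so the ordering is deterministic; this illustrates the idea, it is not the connector's actual query):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foobar (c1 INT, c2 TEXT, update_ts TEXT)")
conn.execute("INSERT INTO foobar VALUES (1, 'foo', '2018-08-02 14:28:29')")

def poll(conn, last_seen):
    """Fetch rows changed since the last poll, like 'mode': 'timestamp'."""
    rows = conn.execute(
        "SELECT c1, c2, update_ts FROM foobar "
        "WHERE update_ts > ? ORDER BY update_ts", (last_seen,)).fetchall()
    new_last = rows[-1][2] if rows else last_seen
    return rows, new_last

last_seen = ""  # the connector persists this offset between polls
first_batch, last_seen = poll(conn, last_seen)   # picks up the initial row

# Two inserts and an update of row 1 all move update_ts past the offset
conn.executemany("INSERT INTO foobar VALUES (?, ?, ?)",
                 [(2, "foo", "2018-08-02 15:18:22"),
                  (3, "foo", "2018-08-02 15:18:33")])
conn.execute("UPDATE foobar SET c2 = 'bar', update_ts = '2018-08-02 15:18:43' "
             "WHERE c1 = 1")

second_batch, last_seen = poll(conn, last_seen)  # all three changed rows
print(second_batch)
```

The second poll returns rows 2 and 3 and the updated row 1, in update_ts order — exactly the set of changes the Confluent consumer displayed above.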