Flink: reading from Kafka and MySQL as separate sources

Requirement

Read data from Kafka, read data from MySQL, and then combine the two streams for processing.

Environment

  • Flink 1.8.2 (the Maven dependencies assumed for this version are sketched below)
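
A minimal Maven dependency sketch for this setup, assuming Scala 2.11 builds of the Flink 1.8.2 artifacts and the MySQL JDBC driver (adjust the versions to your environment):

<properties>
    <flink.version>1.8.2</flink.version>
    <scala.binary.version>2.11</scala.binary.version>
</properties>

<dependencies>
    <!-- Flink DataStream API -->
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-streaming-java_${scala.binary.version}</artifactId>
        <version>${flink.version}</version>
    </dependency>
    <!-- Universal Kafka connector (FlinkKafkaConsumer) -->
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-connector-kafka_${scala.binary.version}</artifactId>
        <version>${flink.version}</version>
    </dependency>
    <!-- MySQL JDBC driver used by the custom source -->
    <dependency>
        <groupId>mysql</groupId>
        <artifactId>mysql-connector-java</artifactId>
        <version>8.0.23</version>
    </dependency>
</dependencies>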

Implementation

public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment executionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment();

        String topic = "mytest";
        Properties properties = new Properties();
        properties.setProperty("bootstrap.servers", "swarm-manager:9092");
        properties.setProperty("group.id", "test");
        // Record deserialization is handled by the SimpleStringSchema passed to the consumer.
        FlinkKafkaConsumer<String> flinkKafkaConsumer = new FlinkKafkaConsumer<>(topic, new SimpleStringSchema(), properties);

        // Kafka source
        DataStreamSource<String> kafkaDataSource = executionEnvironment.addSource(flinkKafkaConsumer);
        SingleOutputStreamOperator<String> kafkaData = kafkaDataSource.map(new MapFunction<String, String>() {....});

        // MySQL source
        DataStreamSource<HashMap<String, String>> mysqlData = executionEnvironment.addSource(new MySqlSource());
}

The code above sets up both sources, one reading from Kafka and one from MySQL.
The MySQL source is implemented as follows:

package com.vincent.project;

import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.HashMap;

// Note: this extends RichParallelSourceFunction, so with parallelism > 1 every subtask
// would re-read the whole table; the connected operator below therefore runs with parallelism 1.
public class MySqlSource extends RichParallelSourceFunction<HashMap<String, String>> {

    private PreparedStatement ps;
    private Connection connection;


    // Called once per subtask to set up the JDBC connection and the prepared statement
    @Override
    public void open(Configuration parameters) throws Exception {
        connection = getConnection();
        String sql = "select user_id, domain from user_domain_config";
        ps = this.connection.prepareStatement(sql);
        System.out.println("open");
    }


    @Override
    public void close() throws Exception {
        if (ps != null) {
            ps.close();
        }
        if (connection != null) {
            connection.close();
        }

    }

    private Connection getConnection() {
        Connection con = null;
        try {
            Class.forName("com.mysql.cj.jdbc.Driver");
            String url = "jdbc:mysql://swarm-manager:3306/imooc_flink?useUnicode=true&characterEncoding=UTF-8";
            String username = "root";
            String password = "123456";
            con = DriverManager.getConnection(url, username, password);
        } catch (Exception e) {
            e.printStackTrace();
            System.out.println("-----------mysql get connection has exception , msg = " + e.getMessage());
        }
        return con;
    }

    /**
     * The core of this source: read all rows from the MySQL table and pack them
     * into a single map (domain -> user_id), which is emitted as one element.
     * @param sourceContext
     * @throws Exception
     */
    @Override
    public void run(SourceContext<HashMap<String, String>> sourceContext) throws Exception {
        ResultSet resultSet = ps.executeQuery();
        HashMap<String, String> map = new HashMap<>();
        while (resultSet.next()) {
            String user_id = resultSet.getString("user_id");
            String domain = resultSet.getString("domain");
            map.put(domain, user_id);
        }
        sourceContext.collect(map);
    }


    @Override
    public void cancel() {
        // Nothing to interrupt: run() executes one query and then returns on its own.
    }
}
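
The connect step below works on a stream called logData of type Tuple3<Long, String, String> rather than on the raw Kafka strings; the parsing step is not shown above. A minimal sketch, assuming a tab-separated log line whose fields are a timestamp, a domain and a traffic value (the field layout here is an assumption), might look like:

        // Hypothetical parser: turns each Kafka record into (timestamp, domain, traffic).
        // The input format is an assumption; adapt the split and indices to the real log layout.
        SingleOutputStreamOperator<Tuple3<Long, String, String>> logData = kafkaData
                .map(new MapFunction<String, Tuple3<Long, String, String>>() {
                    @Override
                    public Tuple3<Long, String, String> map(String value) throws Exception {
                        String[] fields = value.split("\t");
                        return Tuple3.of(Long.parseLong(fields[0]), fields[1], fields[2]);
                    }
                });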

Connecting the two streams:

        // The CoFlatMapFunction's first type parameter is the element type of logData,
        // the second is the element type of the MySQL stream, the third is the output type.
        SingleOutputStreamOperator<String> connectData = logData.connect(mysqlData)
                .flatMap(new CoFlatMapFunction<Tuple3<Long, String, String>, HashMap<String, String>, String>() {

            HashMap<String, String> userDomainMap = new HashMap<>();

            // Handles elements from logData: look up the user id by domain and emit an enriched record.
            @Override
            public void flatMap1(Tuple3<Long, String, String> value, Collector<String> collector) throws Exception {
                String domain = value.f1;
                String userId = userDomainMap.getOrDefault(domain, "");
                System.out.println("userID:" + userId);
                collector.collect(value.f0 + "\t" + value.f1 + "\t" + value.f2 + "\t" + userId);
            }

            // Handles elements from the MySQL stream: replace the cached domain -> user_id mapping.
            @Override
            public void flatMap2(HashMap<String, String> domainToUserId, Collector<String> collector) throws Exception {
                userDomainMap = domainToUserId;
            }
        });
        // Parallelism 1 keeps a single operator instance, so every log record sees the map loaded from MySQL.
        connectData.setParallelism(1).print();

        // Submit the job (the original snippet omits this call, but a streaming job needs it).
        executionEnvironment.execute("enrich kafka log with mysql data");

Using connectors to link the two data sources

An alternative is to wire Kafka and MySQL together with Flink SQL. Note that the DDL connector options and the flink-connector-jdbc artifact used below require a newer Flink release (roughly 1.11+), not the 1.8.2 used above.

1. Add the Kafka and JDBC connector dependencies:

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-kafka_${scala.binary.version}</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-jdbc_${scala.binary.version}</artifactId>
    <version>${flink.version}</version>
</dependency>
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>8.0.23</version>
</dependency>

2. Create the Flink SQL execution environment:

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
EnvironmentSettings settings = EnvironmentSettings.newInstance()
        .useBlinkPlanner()
        .inStreamingMode()
        .build();
StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env, settings);

3. Register the Kafka source table and the MySQL sink table:

tableEnv.executeSql("CREATE TABLE kafka_source (\n" +
        "  id INT,\n" +
        "  name STRING,\n" +
        "  age INT\n" +
        ") WITH (\n" +
        "  'connector' = 'kafka',\n" +
        "  'topic' = 'test',\n" +
        "  'properties.bootstrap.servers' = 'localhost:9092',\n" +
        "  'properties.group.id' = 'testGroup',\n" +
        "  'format' = 'json',\n" +
        "  'scan.startup.mode' = 'earliest-offset'\n" +
        ")");

tableEnv.executeSql("CREATE TABLE mysql_sink (\n" +
        "  id INT,\n" +
        "  name STRING,\n" +
        "  age INT,\n" +
        "  PRIMARY KEY (id) NOT ENFORCED\n" +
        ") WITH (\n" +
        "  'connector' = 'jdbc',\n" +
        "  'url' = 'jdbc:mysql://localhost:3306/test',\n" +
        "  'table-name' = 'user',\n" +
        "  'driver' = 'com.mysql.cj.jdbc.Driver',\n" +
        "  'username' = 'root',\n" +
        "  'password' = 'root'\n" +
        ")");

4. Read from the Kafka source and write into the MySQL sink:

// executeSql submits the INSERT job directly; a separate env.execute() is not needed here.
tableEnv.executeSql("INSERT INTO mysql_sink SELECT * FROM kafka_source");

With this in place, Flink SQL reads the data from Kafka and writes it into the MySQL database.
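
The JDBC connector writes into an existing MySQL table, so the user table has to be created up front. A minimal sketch of matching MySQL DDL (the column types and lengths are an assumption) could be:

CREATE TABLE `user` (
    id   INT PRIMARY KEY,
    name VARCHAR(64),
    age  INT
);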