Kafka, Flink, and Alibaba Cloud data preparation
Spring Boot integration with Kafka
Flink official documentation
Alibaba Cloud AMQP access guide
Spring Boot integration with Kafka
Flink for Kafka
<dependencies>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>4.4</version>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-core</artifactId>
        <version>1.9.2</version>
        <scope>compile</scope>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-connector-kafka -->
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-connector-kafka_2.11</artifactId>
        <version>1.9.2</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.kafka/kafka_2.11 -->
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka_2.11</artifactId>
        <version>2.4.0</version>
        <scope>compile</scope>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-clients -->
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-clients_2.11</artifactId>
        <version>1.9.2</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-java -->
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-java</artifactId>
        <version>1.9.2</version>
    </dependency>
    <!-- flink-streaming jar; the _2.11 suffix is the Scala version -->
    <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-streaming-java -->
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-streaming-java_2.11</artifactId>
        <version>1.9.2</version>
        <scope>provided</scope>
    </dependency>
</dependencies>
Code
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.util.Collector;

import java.util.Properties;

public class FlinkForKafka {

    public static void main(String[] args) throws Exception {
        // Create the Flink execution environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Kafka connection properties
        Properties properties = new Properties();
        properties.setProperty("bootstrap.servers", "IP:9092");
        properties.setProperty("group.id", "default"); // consumer group id
        String inputTopic = "test"; // topic to consume
        String outputTopic = "WordCount";

        // Source
        FlinkKafkaConsumer<String> consumer =
                new FlinkKafkaConsumer<String>(inputTopic, new SimpleStringSchema(), properties);
        DataStream<String> stream = env.addSource(consumer);

        // Transformations
        // Apply Flink operators to the input text stream:
        // split on whitespace, count, key by word, apply a time window, aggregate
        DataStream<Tuple2<String, Integer>> wordCount = stream
                .flatMap((String line, Collector<Tuple2<String, Integer>> collector) -> {
                    String[] tokens = line.split("\\s");
                    // emit (word, 1) for each non-empty token
                    for (String token : tokens) {
                        if (token.length() > 0) {
                            collector.collect(new Tuple2<>(token, 1));
                        }
                    }
                })
                .returns(Types.TUPLE(Types.STRING, Types.INT))
                .keyBy(0)
                .timeWindow(Time.seconds(5))
                .sum(1);

        // Sink
        wordCount.print();

        // Execute
        env.execute("kafka streaming word count");
    }
}
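The flatMap logic above (split a line on whitespace and emit (word, 1) per token) can be sanity-checked in plain Java, without a Kafka cluster or a Flink runtime. This is an illustrative sketch with made-up names, not part of the original job:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TokenizeSketch {

    // Mirrors the job's flatMap + sum: split each line on whitespace,
    // drop empty tokens, and accumulate a count per word.
    static Map<String, Integer> wordCount(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            for (String token : line.split("\\s")) {
                if (token.length() > 0) {
                    counts.merge(token, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> c = wordCount(Arrays.asList("hello world", "hello flink"));
        System.out.println(c.get("hello") + " " + c.get("world") + " " + c.get("flink"));
        // prints: 2 1 1
    }
}
```

The Flink job computes the same per-word counts, except that they are aggregated per 5-second time window rather than over the whole input.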
1. If the following error appears at this point, it is resolved by importing Flink's lib jars.
Kafka producer and consumer setup
Consumer
@KafkaListener(topics = "test")
public void consume(String message) {
    System.out.println("receive msg " + message);
}
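For @KafkaListener to fire, the enclosing class must be a Spring bean (e.g. annotated with @Component), spring-kafka must be on the classpath, and the consumer needs broker settings. A minimal application.properties sketch, assuming a standard Spring Boot setup; the broker address and group id are placeholders to adjust for your environment:

```properties
spring.kafka.bootstrap-servers=IP:9092
spring.kafka.consumer.group-id=default
spring.kafka.consumer.key-deserializer=org.apache.kafka.common.serialization.StringDeserializer
spring.kafka.consumer.value-deserializer=org.apache.kafka.common.serialization.StringDeserializer
```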
Producer
@RestController
public class KafkaProducer {

    private static final String MY_TOPIC = "test";

    @Autowired
    KafkaTemplate<String, String> kafkaTemplate;

    @PostMapping(value = "/kafka")
    public void produce(@RequestParam(value = "msg") String msg) {
        kafkaTemplate.send(MY_TOPIC, msg);
    }
}
Problem encountered
kafkaTemplate throws a NullPointerException.
Adding the following code resolves it:
@Autowired
KafkaTemplate kafkaTemplate;

public static AmqpJavaClientDemo amqpJavaClientDemo;

@PostConstruct
public void init() {
    amqpJavaClientDemo = this;
    amqpJavaClientDemo.kafkaTemplate = this.kafkaTemplate;
}
Usage
amqpJavaClientDemo.kafkaTemplate.send("test",content);
Explanation
The following annotations are required:
@Component
@Autowired: automatic dependency injection
@PostConstruct
A method annotated with @PostConstruct runs when the server loads the Servlet, and is called only once by the server, similar to the Servlet's init() method. It runs after the constructor and before init().
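The fix above is a static-bridge pattern: Spring injects the KafkaTemplate into the managed instance, and the @PostConstruct hook copies a reference to that instance into a static field so static code paths can reach the injected bean. The same idea can be illustrated without Spring; the class and field names below are made up for the sketch:

```java
public class StaticBridgeSketch {

    // stands in for the injected KafkaTemplate
    String template;

    // static handle to the managed instance, filled in by init()
    static StaticBridgeSketch instance;

    // In Spring this method would carry @PostConstruct and run exactly once,
    // after the constructor and after dependency injection.
    void init() {
        instance = this;
    }

    // A static code path that could not use @Autowired directly
    // can now reach the injected field through the static handle.
    static String sendViaStatic(String msg) {
        return "sent:" + msg + " via " + instance.template;
    }

    public static void main(String[] args) {
        StaticBridgeSketch bean = new StaticBridgeSketch();
        bean.template = "kafkaTemplate"; // simulates @Autowired injection
        bean.init();                     // simulates the @PostConstruct callback
        System.out.println(sendViaStatic("hello"));
        // prints: sent:hello via kafkaTemplate
    }
}
```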
Experiment result
The Alibaba Cloud server fetches data and writes it into Kafka; Flink consumes the topic in real time.
When packaging the jar, the following build configuration must be added:
<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>3.5</version>
            <configuration>
                <source>1.8</source>
                <target>1.8</target>
                <encoding>UTF-8</encoding>
            </configuration>
        </plugin>
        <plugin>
            <artifactId>maven-assembly-plugin</artifactId>
            <configuration>
                <descriptorRefs>
                    <descriptorRef>jar-with-dependencies</descriptorRef>
                </descriptorRefs>
                <archive>
                    <manifest>
                        <mainClass>FlinkForKafka</mainClass>
                    </manifest>
                </archive>
            </configuration>
            <executions>
                <execution>
                    <id>make-assembly</id>
                    <phase>package</phase>
                    <goals>
                        <goal>single</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
Once the versions match, the jar can be uploaded through the Flink web UI.
Use Flink's command-line tool flink to submit the freshly packaged job to the cluster. The --class argument specifies which main class serves as the entry point. The command line's usage is covered in detail later.
$ bin/flink run --class com.flink.tutorials.java.api.projects.wordcount.WordCountKafkaInStdOut /Users/luweizheng/Projects/big-data/flink-tutorials/target/flink-tutorials-0.1.jar
The dashboard now shows one more Flink job.
The program's output is written to the .out files under the log directory in the Flink home directory; view the results with the following command:
$ tail -f log/flink--taskexecutor-.out
Stopping the job
In the web UI, just click Cancel.
From the command line:
First query the list of currently running jobs by executing bin/flink list; it shows one running job.
Then run ./bin/flink cancel with the JobID from the list output.