Stream Processing with Kafka + Spark Streaming (WordCount)
1. Required jar package
- Copy kafka-clients-3.0.0.jar from the /home/DYY/spark/kafka_2.12-3.0.0/libs/ directory into the /home/DYY/spark/spark-3.1.1-bin-hadoop2.7/jars/ directory
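The copy step above can be done with a single cp command. This is a sketch using the tutorial's install locations; adjust the paths to match your own Spark and Kafka installations.

```shell
# Copy the Kafka client jar into Spark's jars/ directory so that
# spark-submit and spark-shell can load the Kafka producer classes.
# Paths below follow this tutorial's layout; change them for your machine.
cp /home/DYY/spark/kafka_2.12-3.0.0/libs/kafka-clients-3.0.0.jar \
   /home/DYY/spark/spark-3.1.1-bin-hadoop2.7/jars/
```

Placing the jar in Spark's jars/ directory puts it on the classpath of every Spark application, which avoids passing --jars on each spark-submit invocation.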
2. Write the producer program
- Create the directory tree in a folder of your choice. (Note: mkdir -p /kafka-Spark-Streaming/wordcount/src/main/scala/ is wrong, because the leading slash makes it an absolute path and would create the directory under the filesystem root.)
mkdir -p kafka-Spark-Streaming/wordcount/src/main/scala/
- Change into the wordcount/src/main/scala directory
cd kafka-Spark-Streaming/wordcount/src/main/scala/
- Write the producer program ( vim KafkaWordProducer.scala )
vim KafkaWordProducer.scala
import java.util.HashMap
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerConfig, ProducerRecord}
import org.apache.spark.SparkConf
import org.apache.spark.streaming._
import org.apache.spark.stre