Spark支持Kafka
网上这块资料比较多,不再赘述
1.spark-streaming-kafka-0-8_2.11-2.1.0.jar
2.kafka 的jar 包
3.jar存放路径 spark/jars/kafka
生产者
import org.apache.spark.streaming.kafka._
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerConfig, ProducerRecord}
import org.json4s.jackson.Serialization.write
val props = new HashMap[String, Object]()
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "127.0.0.1:9092")
// value的解析序列化接口实现类(Deserializer)
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,"org.apache.kafka.common.serialization.StringSerializer")
// key的解析序列化接口实现类(Deserializer)
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,"org.apache.kafka.common.serialization.StringSerializer")
// 实例化一个Kafka生产者
val producer = new KafkaProducer[String, String](props)
// rdd.colect即将rdd中数据转化为数组,然后write函数将rdd内容