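1. Requirements:
/**
* 1. Using this JSON file, count how many times mobile users logged in from each operating system
* 2. Count how many times mobile users logged in from each province
* 3. Save the results to Redis
* Note: NC (netcat) can be used to produce and consume the data
* phoneNum: phone number
* terminal: terminal type (operating system)
* province: province (or municipality)
* status: login state (1 = logged in, 0 = not logged in)
*/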
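Before bringing in Spark, the two counts can be sketched on an ordinary in-memory collection. This is only a minimal sketch: the `LoginEvent` case class and the `LoginCounts` object are hypothetical names, and it assumes that only records with status "1" count as logins.

```scala
// Hypothetical record type; the field names mirror the keys listed above.
final case class LoginEvent(phoneNum: String, terminal: String, province: String, status: String)

object LoginCounts {
  // Assumption: only events with status "1" (logged in) are counted.
  def loggedIn(events: Seq[LoginEvent]): Seq[LoginEvent] =
    events.filter(_.status == "1")

  // Number of logins per terminal (operating system).
  def byTerminal(events: Seq[LoginEvent]): Map[String, Int] =
    loggedIn(events).groupBy(_.terminal).map { case (k, v) => k -> v.size }

  // Number of logins per province.
  def byProvince(events: Seq[LoginEvent]): Map[String, Int] =
    loggedIn(events).groupBy(_.province).map { case (k, v) => k -> v.size }
}
```

In a streaming job the same filter-then-count step would typically be expressed as a `filter` followed by `map` and `reduceByKey` on each batch.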
2. Data format:
{"openid":"opEu45VAwuzCsDr6iGIf4qhnUZUI","phoneNum":"18334832972","money":"100","date":"2018-09-14T02:15:16.054Z","lat":39.688011,"log":116.066689,"province":"北京市","city":"北京市","district":"房山区","terminal":"ios","status":"0"}
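A record like the one above could be read with fastjson roughly as follows. This is a sketch rather than the final code: `ParseRecord` is a hypothetical helper, and it assumes only the `terminal`, `province` and `status` fields are needed for the two counts.

```scala
import com.alibaba.fastjson.JSON

object ParseRecord {
  // Returns (terminal, province, status) for one JSON line,
  // or None when the line cannot be parsed as a JSON object.
  // A key that is absent from the object comes back as null.
  def parse(line: String): Option[(String, String, String)] =
    scala.util.Try(JSON.parseObject(line)).toOption
      .filter(_ != null)
      .map(o => (o.getString("terminal"), o.getString("province"), o.getString("status")))
}
```

Wrapping the call in `Try` keeps a single malformed line from failing the whole batch; whether bad lines should be dropped silently or logged is a design choice left open here.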
3. Key points:
* Use fastJson to parse the JSON-formatted records
* Combine Spark with Kafka to process streaming data
* Save the results to Redis
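For the last point, one possible way to accumulate counts in Redis is a hash that is updated with `HINCRBY`. The sketch below is an assumption about the storage layout, not necessarily the one used in the code that follows: `SaveCounts` and the key name `login:terminal` are hypothetical, and the increment step is passed in as a function so the logic can be tried without a running Redis server.

```scala
import redis.clients.jedis.Jedis

object SaveCounts {
  // Adds each count to a hash field through the supplied increment function.
  def save(key: String, counts: Map[String, Int])(incr: (String, String, Long) => Unit): Unit =
    counts.foreach { case (field, n) => incr(key, field, n.toLong) }

  // Redis-backed variant: HINCRBY adds n to the field, creating it if it does not exist.
  def saveToRedis(jedis: Jedis, key: String, counts: Map[String, Int]): Unit =
    save(key, counts) { (k, f, n) => jedis.hincrBy(k, f, n); () }
}
```

With a Redis server available, a call such as `SaveCounts.saveToRedis(jedis, "login:terminal", Map("ios" -> 3))` would add 3 to the `ios` field, so the totals keep growing across streaming batches.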
4. Code:
import com.alibaba.fastjson.JSON
import com.typesafe.config.ConfigFactory
import org.apache.commons.pool2.impl.GenericObjectPoolConfig
import org.apache.kafka.clients.consumer.ConsumerRecord
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession
import org.apache.spark.streaming.dstream.InputDStream
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}
import org.apache.spark.streaming.{Durations, StreamingContext}
import redis.clients.jedis.{
Jedis<