Spark in Practice: A Simple Spark WordCount Example

docker

```shell
docker pull sequenceiq/spark:1.6.0

docker run -it -p 8088:8088 -p 8042:8042 -p 4040:4040 -h sandbox sequenceiq/spark:1.6.0 bash
```

maven

```xml
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-test</artifactId>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_${scala.binary.version}</artifactId>
    <version>${spark.version}</version>
    <exclusions>
        <exclusion>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
        </exclusion>
        <exclusion>
            <groupId>log4j</groupId>
            <artifactId>log4j</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming_${scala.binary.version}</artifactId>
    <version>${spark.version}</version>
    <exclusions>
        <exclusion>
            <groupId>commons-logging</groupId>
            <artifactId>commons-logging</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming-kafka_${scala.binary.version}</artifactId>
    <version>${spark.version}</version>
</dependency>
<dependency>
    <groupId>com.fasterxml.jackson.module</groupId>
    <artifactId>jackson-module-scala_${scala.binary.version}</artifactId>
    <version>2.7.3</version>
</dependency>
<dependency>
    <groupId>com.fasterxml.jackson.module</groupId>
    <artifactId>jackson-module-jaxb-annotations</artifactId>
    <version>2.7.4</version>
</dependency>
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
    <version>2.7.4</version>
</dependency>
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-annotations</artifactId>
    <version>2.7.4</version>
</dependency>
```

Versions used:

```xml
<properties>
    <scala.binary.version>2.10</scala.binary.version>
    <spark.version>1.6.1</spark.version>
</properties>
```

The key point is to pull in the Scala module of Jackson; without it the job fails with the following error:

```
Exception in thread "main" java.lang.VerifyError: class com.fasterxml.jackson.module.scala.ser.ScalaIteratorSerializer overrides final method withResolved.(Lcom/fasterxml/jackson/databind/BeanProperty;Lcom/fasterxml/jackson/databind/jsontype/TypeSerializer;Lcom/fasterxml/jackson/databind/JsonSerializer;)Lcom/fasterxml/jackson/databind/ser/std/AsArraySerializerBase;
```
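If you would rather not repeat the version on every Jackson artifact, a common Maven alternative (a sketch, not part of the original post) is to align them once via `dependencyManagement`:

```xml
<dependencyManagement>
    <dependencies>
        <!-- Pin all Jackson core artifacts to one version so Spark and
             jackson-module-scala resolve against the same binaries -->
        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-databind</artifactId>
            <version>2.7.4</version>
        </dependency>
        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-annotations</artifactId>
            <version>2.7.4</version>
        </dependency>
        <dependency>
            <groupId>com.fasterxml.jackson.module</groupId>
            <artifactId>jackson-module-scala_${scala.binary.version}</artifactId>
            <version>2.7.3</version>
        </dependency>
    </dependencies>
</dependencyManagement>
```

The `VerifyError` above is a symptom of mixed Jackson versions on the classpath, so keeping them aligned in one place avoids the problem recurring as dependencies change.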

streaming

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFunction;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import ch.qos.logback.classic.Level;

import scala.Tuple2;

public class NetworkWordCount {

    private static final Logger LOGGER = LoggerFactory.getLogger(NetworkWordCount.class);

    /**
     * nc -lk 9999
     * http://192.168.0.102:4040
     */
    public void start() {
        // Quiet logback's chatty default output
        ch.qos.logback.classic.Logger root =
                (ch.qos.logback.classic.Logger) LoggerFactory.getLogger(Logger.ROOT_LOGGER_NAME);
        root.setLevel(Level.WARN);

        SparkConf sparkConf = new SparkConf().setMaster("local[2]").setAppName("NetworkWordCount");
        JavaStreamingContext streamingContext = new JavaStreamingContext(sparkConf, Durations.seconds(1));

        // Create a DStream that will connect to hostname:port, like localhost:9999
        JavaReceiverInputDStream<String> lines = streamingContext.socketTextStream("localhost", 9999);

        // Split each line into words
        JavaDStream<String> words = lines.flatMap(new FlatMapFunction<String, String>() {
            @Override
            public Iterable<String> call(String x) {
                LOGGER.debug("flatMap called -> [{}]", x);
                return Arrays.asList(x.split(" "));
            }
        });

        // Count each word in each batch
        JavaPairDStream<String, Integer> pairs = words.mapToPair(new PairFunction<String, String, Integer>() {
            @Override
            public Tuple2<String, Integer> call(String s) {
                return new Tuple2<>(s, 1);
            }
        });

        JavaPairDStream<String, Integer> wordCounts = pairs.reduceByKey(new Function2<Integer, Integer, Integer>() {
            @Override
            public Integer call(Integer i1, Integer i2) {
                return i1 + i2;
            }
        });

        // Print the first ten elements of each RDD generated in this DStream to the console
        wordCounts.print();

        streamingContext.start();            // Start the computation
        streamingContext.awaitTermination(); // Wait for the computation to terminate
    }
}
```
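The per-batch computation above is just flatMap → map-to-pair → reduce-by-key. As a minimal sketch of the same transformation on an in-memory batch — plain Java streams, no Spark; the class and method names here are illustrative, not from the original post:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class WordCountSketch {

    // Same shape as the DStream pipeline: split lines into words, then count per word
    static Map<String, Long> countWords(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split(" ")))  // flatMap: line -> words
                .collect(Collectors.groupingBy(w -> w,            // groupingBy + counting plays
                        TreeMap::new, Collectors.counting()));    // the role of mapToPair + reduceByKey
    }

    public static void main(String[] args) {
        System.out.println(countWords(Arrays.asList("to be or not to be", "to be")));
        // prints {be=3, not=1, or=1, to=3}
    }
}
```

The `TreeMap` factory is there only so the printed order is deterministic; Spark's `reduceByKey` makes no ordering guarantee either.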

run

Start netcat:

```shell
nc -lk 9999
```

Run the program, type some space-separated words into the netcat session, then open the Spark UI:

http://192.168.0.102:4040/

