Spark
hxhh
Sharing practical work tips and useful know-how, for mutual encouragement.
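Integrating Spark Streaming with Flume (Poll and Push modes)
Flume, a framework for real-time log collection, can be connected to the Spark Streaming real-time processing framework: Flume produces data in real time and Spark Streaming processes it in real time. Spark Streaming can interface with Flume NG in two ways: either Flume NG pushes messages to Spark Streaming, or Spark Streaming polls (pulls) data from Flume. 6.1 Poll mode (1) Install flu... Original 2018-10-18 10:58:13 · 919 views · 0 comments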
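Setting up a Spark cluster
1.1 Download the Spark package. Download location (Spark website): http://spark.apache.org/downloads.html. Here we use the spark-2.0.2-bin-hadoop2.7 release. 1.2 Plan the installation directory: /opt/bigdata 1.3 Extract the package: tar -zxvf spark-2.0.2-bin-hadoop2.7.tgz 1.4 Rename the directory: mv spark-2.0.2-bin... Original 2018-10-15 10:20:43 · 110 views · 0 comments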
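Setting up a highly available Spark cluster
Build a highly available Spark cluster with ZooKeeper. 1. A ZooKeeper cluster must be set up first. 2. Edit the configuration file (spark-env.sh): comment out export SPARK_MASTER_HOST=hdp-node-01 and add SPARK_DAEMON_JAVA_OPTS: export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -D... Original 2018-10-15 11:10:30 · 221 views · 0 comments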
org.apache.spark.examples.SparkPi
Warning: Local jar /usr/local/spark/ does not exist, skipping. java.lang.ClassNotFoundException: org.apache.spark.examples.SparkPi at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at jav... Original 2018-10-15 12:38:48 · 9240 views · 10 comments
org.apache.spark.SparkException: Exception thrown in awaitResult (Spark error)
WARN StandaloneAppClient$ClientEndpoint: Failed to connect to master node1:7077 org.apache.spark.SparkException: Exception thrown in awaitResult at org.apache.spark.rpc.RpcTimeout$$anonfun$1.a... Original 2018-10-20 09:18:49 · 26954 views · 0 comments
TaskSchedulerImpl: Initial job has not accepted any resources (Spark error)
18/10/20 10:08:16 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources. As shown in the figure, the error occurs when running the script. Cause of the error:... Original 2018-10-20 10:54:38 · 1069 views · 0 comments