Reading Data with SparkR & Configuring Spark to Run

1. Installing Spark in a local environment and configuring a trial run (example on Ubuntu)

# Open the profile file to set the environment variables: JAVA, SCALA, SPARK, HADOOP, SBT
gedit /etc/profile
 
# Add the following lines to the file
export JAVA_HOME=/usr/java/jdk1.8.0_51
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export SCALA_HOME=/usr/scala/scala-2.11.7
export PATH=$SCALA_HOME/bin:$PATH
export SPARK_HOME=/usr/spark/spark-1.4.1-bin-without-hadoop
export PATH=$SPARK_HOME/bin:$PATH
export SBT_HOME=/usr/scala/sbt
export PATH=$SBT_HOME/bin:$PATH
export HADOOP_HOME=/usr/hadoop/hadoop-2.7.0
export PATH=$HADOOP_HOME/bin:$PATH
export CLASSPATH=$CLASSPATH:$HADOOP_HOME/lib
 
# Reload the profile so the changes take effect in the current shell
source /etc/profile
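
Before moving on, it can help to confirm that the variables exported above are visible in the current shell and point at real directories. The following is a minimal sketch, not part of the original setup; check_home is a hypothetical helper name, and the list of variables simply mirrors the exports above.

```shell
# check_home is a hypothetical helper (not part of Spark or Hadoop).
# It prints the value of the variable whose NAME is passed as $1 and
# returns non-zero if the variable is unset or is not a directory.
check_home() {
  # Look up the variable by name; ch_value is empty if it is unset.
  eval "ch_value=\${$1:-}"
  if [ -z "$ch_value" ]; then
    echo "$1: not set"
    return 1
  fi
  if [ ! -d "$ch_value" ]; then
    echo "$1: $ch_value (directory not found)"
    return 1
  fi
  echo "$1: $ch_value"
}

# Check every variable exported in /etc/profile above.
for var in JAVA_HOME SCALA_HOME SPARK_HOME SBT_HOME HADOOP_HOME; do
  check_home "$var" || true
done
```

Any line reporting "not set" or "directory not found" usually points at a typo in /etc/profile or at a profile that has not been re-sourced in this terminal.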

 

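Edit Spark's configuration file spark-env.sh: rename the spark-env.sh.template file in Spark's conf directory to spark-env.sh, then add the following environment variables and classpath entries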

export SCALA_HOME=/usr/scala/scala-2.11.7
export JAVA_HOME=/usr/java/jdk1.8.0_51
export HADOOP_CONF_DIR=/usr/hadoop/hadoop-2.7.0/etc/hadoop
export SPARK_LOCAL_IP=localhost
export SPARK_PUBLIC_DNS=localhost
 
export SPARK_CLASSPATH=${HADOOP_HOME}/share/hadoop/common/hadoop-common-2.7.0.jar:${HADOOP_HOME}/share/hadoop/common/hadoop-nfs-2.7.0.jar
export SPARK_CLASSPATH=${SPARK_CLASSPATH}:${HADOOP_HOME}/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar:${HADOOP_HOME}/share/hadoop/common/lib/slf4j-api-1.7.10.jar:${HADOOP_HOME}/share/hadoop/common/lib/log4j-1.2.17.jar:${HADOOP_HOME}/share/hadoop/common/lib/commons-configuration-1.6.jar:${HADOOP_HOME}/share/hadoop/common/lib/commons-collections-3.2.1.jar:${HADOOP_HOME}/share/hadoop/common/lib/guava-11.0.2.jar:${HADOOP_HOME}/share/hadoop/common/lib/commons-lang-2.6.jar:${HADOOP_HOME}/share/hadoop/common/lib/hadoop-auth-2.7.0.jar:${HADOOP_HOME}/share/hadoop/common/lib/jetty-6.1.26.jar

export SPARK_CLASSPATH=${SPARK_CLASSPATH}:${HADOOP_HOME}/share/hadoop/common/lib/jersey-server-1.9.jar:${HADOOP_HOME}/share/hadoop/common/lib/jersey-core-1.9.jar:${HADOOP_HOME}/share/hadoop/common/lib/jersey-json-1.9.jar:${HADOOP_HOME}/share/hadoop/common/lib/snappy-java-1.0.4.1.jar

export SPARK_CLASSPATH=${SPARK_CLASSPATH}:${HADOOP_HOME}/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.7.0.jar

export SPARK_CLASSPATH=${SPARK_CLASSPATH}:${SPARK_HOME}/lib/spark-assembly-1.4.1-hadoop2.2.0.jar:${SPARK_HOME}/lib/spark-1.4.1-yarn-shuffle.jar:${SPARK_HOME}/lib/spark-examples-1.4.1-hadoop2.2.0.jar
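
Because the classpath is assembled by hand from many jar paths, a single mistyped file name is easy to miss. The sketch below is one possible way to spot such entries; list_missing_entries is a hypothetical helper name, and it assumes SPARK_CLASSPATH is a ':'-separated list of paths without spaces or wildcard characters.

```shell
# list_missing_entries is a hypothetical helper (not part of Spark).
# It splits the ':'-separated classpath given as $1 and prints every
# entry that does not exist on disk. It prints nothing if all exist.
list_missing_entries() {
  old_ifs=$IFS
  IFS=:
  # $1 is deliberately unquoted so the shell splits it on ':'.
  for entry in $1; do
    [ -e "$entry" ] || echo "$entry"
  done
  IFS=$old_ifs
}

# Check the classpath currently set in this shell (empty if unset).
list_missing_entries "${SPARK_CLASSPATH:-}"
```

To try it against the settings above, source spark-env.sh in a terminal first and then run the last line; each printed path is a jar that the JVM would not find.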

 

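When you run ./bin/spark-shell from the Spark directory and the Spark welcome banner appears, followed by the scala> prompt, Spark has started successfully in local mode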
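
The same check can also be scripted instead of done by eye. The following is a rough sketch under stated assumptions: smoke_test_spark is a hypothetical helper name, SPARK_HOME is assumed to be set as in /etc/profile above, and the one-line Scala job piped into spark-shell is only an illustrative choice. The local[2] master runs Spark inside a single process with two worker threads.

```shell
# smoke_test_spark is a hypothetical helper (not part of Spark).
# It runs one tiny job through spark-shell in local mode, or reports
# that the launcher script cannot be found and returns non-zero.
smoke_test_spark() {
  shell_bin="${SPARK_HOME:-}/bin/spark-shell"
  if [ ! -x "$shell_bin" ]; then
    echo "spark-shell not found at $shell_bin"
    return 1
  fi
  # Pipe a single Scala statement into the REPL; it exits at end of input.
  echo 'println(sc.parallelize(1 to 100).count())' | "$shell_bin" --master "local[2]"
}
```

Calling smoke_test_spark in a terminal where /etc/profile has been sourced should start the shell, run the statement, and exit; a "not found" message instead suggests that SPARK_HOME does not point at the unpacked Spark directory.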

 

2. Running Spark commands from R to process files

library(SparkR)
# Create a new SparkContext
sc <- sparkR.init(master = "local")

Reposted from: https://www.cnblogs.com/xinping-study/p/6897920.html
